Luck

Winning a lottery, being hit by a stray bullet, or surviving a plane crash are all instances of a mundane phenomenon: luck. Mundane as it is, the concept of luck nonetheless plays a pivotal role in central areas of philosophy, either because it is the key element of widespread philosophical theses or because it gives rise to challenging puzzles. For example, a common claim in philosophy of action is that acting because of luck prevents free action. A platitude in epistemology is that coming to believe the truth by sheer luck is incompatible with knowing. If two people act in the same way but the consequences of one of their actions are worse due to luck, should we morally assess them in the same way? Is the inequality of a person unjust when it is caused by bad luck? These two complex issues are a matter of controversy in ethics and political philosophy, respectively.

A legitimate question is whether the concept of luck itself is worthy of philosophical investigation. One might think that it is not, given (i) how acquainted we are with the phenomenon of luck in everyday life and (ii) the fact that progress has been made in the aforementioned debates on the assumption of a pre-theoretical understanding of the notion.

However, the idea that a rigorous analysis of the general concept of luck might serve to make further progress in areas of philosophy where the notion plays a fundamental role has motivated a recent and growing philosophical literature on the nature of luck itself. Although some might be skeptical that investigating the nature of luck in general can help shed light on long-standing philosophical debates such as the nature of knowledge—see Ballantyne 2014—it is hard to maintain that no general account of luck will be able to ground any substantive claim in areas of philosophy where the notion is constantly invoked but left undefined. This article gives an overview of current philosophical theorizing about the concept of luck itself.

Table of Contents

  1. Preliminary Remarks
    a. The Bearers of Luck
    b. The Target of the Analysis
    c. General Features of Luck
  2. Luck and Significance
  3. Probabilistic Accounts
    a. Objective Accounts
    b. Subjective Accounts
  4. Modal Accounts
  5. Lack of Control Accounts
  6. Hybrid Accounts
  7. Luck and Related Concepts
    a. Accidents
    b. Coincidences
    c. Fortune
    d. Risk
    e. Indeterminacy
  8. References and Further Reading

1. Preliminary Remarks

The following preliminary remarks will address three questions: (1) What are the bearers of luck? (2) What is the target of the analysis of current accounts of luck? (3) What general features of luck should an adequate analysis of luck be able to explain?

a. The Bearers of Luck

The best way to find out what the bearers of luck are consists in considering the kind of entities of which we predicate luck-involving terms and expressions such as “lucky,” “a matter of luck,” or “by luck.”

1. Agents. On the one hand, the term “lucky” can be predicated of agents—for example, “Chloe is lucky to win the lottery.” In general, the kind of beings to which we attribute luck are beings with objective or subjective interests such as self-preservation or desires—see Ballantyne (2012) for further discussion. In this sense, a human or a dog is lucky to survive a fortuitous rockfall, but a stick of wood or a car is not. Still, at least in some contexts, it seems correct to attribute luck to an object without interests, as when one says that one’s beloved car is lucky not to have been damaged by a fortuitous rockfall. However, such assertions are felicitous only insofar as they are parasitic on our interests. No one would say that a stick of wood is lucky not to have been destroyed by a rockfall if its existence bore absolutely no significance to anyone’s interests, and if one did, one would only be speaking figuratively.

A related question is whether the kind of agents to which we attribute luck are only individuals or whether luck can also be ascribed to collectives. There is certainly a sense in which a group of individuals can be said to be lucky, as when we say that a group of climbers is lucky to have survived an avalanche. Coffman (2007) suggests that there seems to be no reason why group luck cannot be reduced to or explained in terms of individual luck. But if one holds—with many theorists working on collective intentionality—that groups can be the bearers of intentional states, it might turn out that group luck cannot be so easily reduced to individual luck. For example, if it is by bad luck that a manufacturing company fails to achieve its yearly revenue goal—so it is bad luck for the company—it does not necessarily follow that each and every one of its workers—for example, people working on the assembly line—is also unlucky, if, say, the law prevents them from being fired and the company’s viability is not compromised.

2. Events. On the other hand, the term “lucky” and expressions such as “a matter of luck” or “by luck” can be predicated of events—for example, “Chloe’s lottery win was lucky”—and states of affairs—for example, “It is a matter of luck that Chloe won” or “Chloe’s winning the lottery was by luck”; see Coffman (2014) for further discussion. Plausibly, luck-involving expressions can be also predicated of items belonging to related metaphysical categories such as accomplishments, achievements, actions, activities, developments, eventualities, facts, occurrences, performances, processes, and states. For presentation purposes, luck will be here described as a phenomenon that applies to agents and events, where by “agent” is meant any being with interests and by “event” any member of the previous categories.

b. The Target of the Analysis

1. Relational versus non-relational luck. We say things such as (1) an event E is lucky for an agent S and (2) S is lucky that E. We also say things such as (3) it is a matter of luck that E and (4) E is by luck. Milburn (2014) argues that (1) and (2) are plausibly equivalent: E is lucky for S if and only if S is lucky that E. (3) and (4) also seem equivalent: it is a matter of luck that E if and only if E is by luck. However, (1) and (2) are not equivalent to (3) and (4). Milburn is right in pointing out that this marks an important distinction that anyone in the business of analyzing luck should keep in mind.

The difference between (1) and (2), on the one hand, and (3) and (4), on the other, is that (1) and (2) denote a relation between an agent and an event, whereas (3) and (4) are not indicative of any relation and only apply to events. Call the kind of luck denoted by (1) and (2) relational luck and the kind of luck denoted by (3) and (4) non-relational luck—Milburn uses different terminology: he employs the expression “subject-relative luck” to refer to relational luck and “subject-involving luck” to refer to non-relational luck when the relevant event concerns an agent’s action.

Relational luck can be distinguished from non-relational luck regardless of whether the target event is an agent’s state or an agent’s action. For instance, when the relevant event is an action by the agent—for example, that S scores a goal—the luck-involving expressions in (3) and (4) concern the agent—for example, it is a matter of luck that S scores a goal—but fail to establish a relationship between the agent—S—and the event—S’s scoring of a goal. In contrast, if the target event is the agent’s action, (1) and (2) do establish a relationship between the agent and her action—for example, that S scores a goal is lucky for S.

In the literature, most accounts of luck try to explain what it takes for an event to be lucky for an agent. In other words, they focus on relational luck. But it might well be that in order to shed light on the special varieties of luck—for example, epistemic, moral, distributive luck—one might need to shift the focus of the analysis to non-relational luck—see Milburn (2014) for further discussion.

2. Synchronic versus diachronic luck. Most accounts of (relational and non-relational) luck focus on when an event is lucky—for an agent or simpliciter—at one point in time. However, Hales (2014) argues that luck may be predicated not only synchronically—that is, of an event’s occurrence at a certain time—but also diachronically—that is, of a series or streak of events occurring at different times. For example, synchronically, we say things such as “Joe was lucky to hit the baseball at the end of the game.” Diachronically, we say things such as “Joe was lucky to safely hit in 56 consecutive baseball games.” Hales’s point is that we can be lucky diachronically but not synchronically, and vice versa. By contrast, McKinnon (2013; 2014) argues that while we can determine the presence and degree of diachronic luck—for example, luck in a streak of successful performances—we cannot determine the presence of synchronic luck—that is, whether a particular performance is by luck.

3. Strokes of luck. An important departure from standard analyses of relational and non-relational luck is Coffman (2014; 2015), who thinks that the notion of an event E being a stroke of luck for an agent S is more fundamental than the notion of E being lucky for S—or, more simply, than the notion of a lucky event—and that the former should therefore be the target of the analysis of any adequate account of luck. Nonetheless, Coffman’s account of strokes of luck features the same kind of conditions that other authors give in their analyses of the notion of a lucky event. In view of this, Hales (2015) objects that Coffman’s approach unnecessarily adds an extra layer of complexity to the already complex analysis of luck and casts doubt on whether an analysis of the notion of a stroke of luck can shed any more light than an analysis of the notion of a lucky event in those areas of philosophy where the concept of luck plays a significant role.

c. General Features of Luck

Before entering into further details, it is convenient to highlight three general features of luck that any adequate analysis of the concept should be able to explain.

Goodness and badness. Luck can be good or bad. This is clearly true of relational luck. For instance, we say things such as “Dylan was lucky to survive the car accident” or “Dylan was unlucky to die in the car accident” to mean, respectively, that it is good luck that he survived and bad luck that he died. Moreover, one and the same event can be both good and bad luck for an agent, which plausibly has to do with the fact that two or more interests of the agent are at stake—Ballantyne (2012). For example, losing one’s keys and having to spend the night outdoors is bad luck if one gets a cold as a consequence, but it is also good luck if one thereby avoids an explosion in one’s apartment.

By contrast, attributions of non-relational luck do not so clearly convey good or bad luck—for example, “The discovery of Pluto was a matter of luck.” This is plausibly due to the fact that such attributions do not denote any relationship between a lucky event and an agent or group of agents. To put it differently, if we interpret attributions of that sort as conveying good or bad luck, it is probably because we read them as denoting such a relationship. At any rate, accounting for why luck is good or bad is a desideratum at least for analyses of relational luck.

Finally, although the term “lucky” is ordinarily associated with good luck, in the philosophical literature, it is used to denote events that instantiate good luck as well as events that instantiate bad luck. This is done mainly for the sake of simplicity.

Vagueness. Luck is to some extent a vague notion. Not all instances of luck are as clear-cut as a lottery win. For example, goals from corner kicks in professional soccer matches are considered neither clearly lucky nor clearly produced by skill. Pritchard (2005: 143) gives another example: if someone drops her wallet, keeps walking, realizes five minutes later that she has lost it, returns to the place where she dropped it, and finds it, is that person lucky to have found her wallet? The answer is not clear. Accordingly, we should not expect an analysis of luck to remove this vagueness. On the contrary, an adequate account should predict borderline cases, that is, cases that are neither clearly lucky nor clearly non-lucky. This is a desideratum for accounts not only of relational luck but also of non-relational luck.

Gradualness. Luck is a gradual notion. In ordinary parlance, it is common to attribute different degrees of luck to different events. For example, winning one million dollars playing roulette is luckier than winning one dollar, even if the odds are the same. Interestingly, winning the prize of an ordinary lottery is luckier than winning the same amount of money by tossing a coin, that is, when the odds of winning are lower. An adequate analysis of luck should also be able to account for these differences in degree. Again, this concerns accounts of relational luck as well as of non-relational luck.

2. Luck and Significance

Several atomic nuclei joining and triggering an explosion is an event that is neither lucky nor unlucky for anyone if it happens at the other end of the galaxy. But it is bad luck if the explosion takes place nearby. One way to account for the difference in luckiness is that while the former event is not significant to anyone, the latter is significant to whoever is nearby. Cases like this motivate philosophers who theorize about the concept of luck to endorse a significance condition, that is, a requirement to the effect that an event is lucky for an agent only if the event is significant to the agent.

Since the significance condition establishes a relationship between an agent and an event, whether one thinks that such a condition is needed depends on what the target of one’s account is. For instance, if one is in the business of analyzing relational luck, one will be willing to include a significance condition in one’s analysis. But if one’s aim is instead to account for non-relational luck—that is, for when an event is lucky simpliciter—one will be reluctant to include such a condition in one’s analysis—see Pritchard (2014) for further discussion.

Although there is wide agreement that an adequate analysis of relational luck must include a significance condition, there is significant disagreement over its specific formulation. Pritchard (2005: 132–3) formulates the significance condition as follows:

S1: An event E is lucky for an agent S only if S would ascribe significance to E, were S to be availed of the relevant facts.

S1 requires that lucky agents have the capacity to ascribe significance. But that is problematic insofar as the condition prevents sentient nonhuman beings (Coffman 2007) and human beings with diminished capacities like newborns or comatose adults (Ballantyne 2012) from being lucky.

Coffman (2007) proposes an alternative significance condition in terms of the positive or negative effect of lucky events on the agent:

S2: An event E is lucky for an agent S only if (i) S is sentient and (ii) E has some objective evaluative status for S—that is, E has some objectively good or bad, positive or negative, effect on S.

Ballantyne (2012) gives a counterexample to S2 by arguing, first, that (ii) should be read as follows:

(ii)* E has some objectively positive or negative effect on S’s mental states.

The reason given by Ballantyne is that if the event’s effect is not on the agent’s mental states, it is not obvious why clause (i) is required. With that in place, the counterexample to S2 goes as follows: an unlucky man has no inkling that scientists have randomly selected him to have his brain put in a vat, where his neural connections are fed with real-world experiences. The case is allegedly troublesome for S2 because the event, which is bad luck for the man, has no impact on the man’s mental states and, in particular, on his interior life, which is not altered.

A reply might be that, although the fact that the man’s brain is put in a vat does not affect the man’s interior life—in particular, his phenomenal mental states—it certainly affects his representational mental states. In particular, most of them turn out to be false, which seems to be objectively negative for the man, just as S2 requires.

Ballantyne (2012) proposes an alternative formulation of the significance condition in terms of the positive or negative effect of lucky events on the agent’s interests:

S3: An event E is lucky for an agent S only if (i) S has a subjective or objective interest N and (ii) E has some objectively positive or negative effect on N—in the sense that E is good or bad for S.

S3 is more specific than S2 in the kind of attributes that are supposed to be positively or negatively affected by lucky events. While S2 does not say whether these need to be the qualitative states of sentient beings, or their representational states, or their physical condition, S3 is explicit that what lucky events affect are the subjective and objective interests of individuals.

Leaving aside the question of what the correct formulation of the significance condition is, it is interesting to see how a significance condition can help explain the three general features of luck outlined above, that is, the goodness or badness, vagueness, and gradualness of luck. Concerning goodness and badness, the explanation is straightforward: luck is good or bad because the significance that lucky events have for people is positive or negative. Concerning vagueness, significance is itself a vague concept, so including a significance condition in an analysis of luck at least does not remove its inherent vagueness. Concerning gradualness, it can be argued that the degree of luck of an event varies in proportion to its significance or value—Latus (2003), Levy (2011: 36), Rescher (1995: 211–12; 2014). Consider the previous example of winning one million dollars playing roulette versus winning one dollar when the probability of winning is the same: it can simply be argued that the former event is luckier than the latter because it is more significant.

3. Probabilistic Accounts

Paradigmatic lucky events—for example, winning a fair lottery—typically occur by chance. Probabilistic accounts of luck explicitly appeal to the probability of an event’s occurrence to explain why it is by luck. In addition, they typically include a significance condition to explain why events are lucky for agents. For discussion purposes, the analyses of luck below will be presented as analyses of significant events, so the relevant significance condition can be omitted.

a. Objective Accounts

Some accounts make use of objective probabilities to define luck, that is, the kind of probabilities that are not determined by an agent’s evidence or degree of belief, but by features of the world:

OP1: A significant event E is lucky for an agent S at time t if and only if, prior to the occurrence of E at t, there was low probability that E would occur at t.

OP1 says that lucky events are events whose occurrence was not objectively likely. A related way to formulate a probabilistic view—suggested by Baumann 2012—is by means of conditional objective probabilities:

OP2: A significant event E is lucky for an agent S at time t if and only if, prior to the occurrence of E at t, there was low objective probability conditional on C that E would occur at t.

C is whatever condition one uses to determine the probability that the event will occur. For example, the unconditional probability that Lionel Messi will score a goal in the soccer match is high, but given C—the fact that he is injured—the probability that he will score is low. Suppose that Messi ends up scoring by luck. The condition helps explain why: he was injured and therefore it was not very likely that he would score.
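
To make the role of the condition explicit, the comparison can be put in conditional-probability notation with purely illustrative figures: the unconditional probability Prob[Messi scores] might be, say, 0.7, whereas the conditional probability Prob[Messi scores | Messi is injured] might be only 0.1. OP2 assesses the luckiness of the goal against the latter, conditional value, which is why the goal comes out as lucky despite Messi’s general scoring record.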

According to Hales (2014), probabilistic views of luck such as OP1 or OP2 are the most widespread among scientists and mathematicians. But they face at least two problems. First, a dominant—although not undisputed—idea is that necessary truths have probability 1. In view of this, Hales (2014) argues that probabilistic analyses cannot account for lucky necessities, which are maximally probable. For example, he contends that organisms—humans included—are lucky to be alive because the gravitational constant, G, has the value it actually has, and yet the probability that G made life possible is 1.

Second, another problem for probabilistic accounts is that, although rare, there are highly probable lucky events, that is, lucky events whose occurrence is highly probable—see Broncano-Berrocal (2015). Suppose that someone is the most wanted person in the galaxy and that billions of mercenaries are trying to kill her, but also that her combat skills drastically reduce the probability that each independent assassination attempt will succeed. Suppose that one such attempt succeeds for completely fortuitous reasons that have nothing to do with the exercise of her skills. That she is killed is obviously bad luck, but it was also very probable given how many mercenaries were trying to kill her: even if each killing attempt had a low probability of succeeding, the probability that at least one would succeed was high given the number of independent attempts—that is, the probability of the disjunction of all the attempts was high. This shows, contrary to what OP1 and OP2 say, that luck does not entail a low probability of occurrence.
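
The point can be checked with a purely illustrative calculation. If each of n independent assassination attempts succeeds with probability p, the probability that at least one of them succeeds is:

1 – (1 – p)^n.

With, say, p = 0.000001 and n = 1,000,000,000 attempts, this value is approximately 1 – e^–1000, which is practically 1. Even though each attempt was very unlikely to succeed, the target’s death was all but certain—and yet it is intuitively a matter of bad luck.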

OP1 and OP2 are analyses of synchronic luck. McKinnon (2013; 2014) proposes a probabilistic account of diachronic luck instead. The view, called the expected outcome view, starts with the observation that we can determine the expected objective ratio of outcomes for many events, including people’s performances. By way of illustration, the expected ratio for flipping a coin is 50 percent tails and 50 percent heads. On the other hand, the expected ratio of a certain basketball player’s free-throw shots being successful might be 90 percent. However, in real-life series of tosses or free-throw shots, the outcomes typically deviate from those values. In light of these considerations, McKinnon proposes the following view:

OP3: For any series A of events (E1, E2, …, En) that are significant to an agent S and for any objective expected ratio N of outcomes for events of type E, S is lucky proportionally to how much the actual ratio of outcomes in A deviates from N.

In a nutshell, McKinnon’s view is that we attribute any deviation from the expected ratio of outcomes to luck—to good luck if the deviation is positive and to bad luck if the deviation is negative. If the actual ratio is as expected, the ratio is fully attributable to skill. One key element of McKinnon’s view—and the reason why she rejects any attempt to give an account of synchronic luck—is that she thinks that, while we can know that the outcomes that deviate from the expected ratio are due to luck, we cannot know which of the outcomes in that set is by luck. In other words, we can know whether we are diachronically lucky, but not whether we are synchronically lucky.
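
A purely illustrative example may help. Suppose the basketball player’s objective expected free-throw ratio is 90 percent and she attempts 100 free throws over a season, so the expected number of successes is 90. If she in fact makes 95, the positive deviation of 5 successes is, on the expected outcome view, attributable to good luck; if she makes only 85, the negative deviation of 5 is attributable to bad luck; and in either case the view denies that we can identify which particular shots were the lucky—or unlucky—ones.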

Before turning to a different type of probabilistic account, let us see how accounts modeling luck in terms of objective probability explain the three general features of luck outlined above. On the one hand, they can explain why luck is a gradual notion in a natural way. For instance, Rescher (1995: 211–12; 2014) thinks that luck varies not only with significance but also with chance. If S is the value or significance of an event E, how lucky E is can be determined, according to Rescher, as follows:

Luck = S × (1 – Prob[E]).

In other words, Rescher thinks that the degree of luck of an event varies proportionally with the value or significance that the event has for the agent and decreases as the probability of its occurrence increases.
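
Applied to the earlier roulette example—with purely illustrative figures—the formula delivers the expected verdict. Suppose both prizes are won on a bet with the same probability of, say, 0.027, and that significance is measured, for simplicity, by the monetary value of the prize:

Luck of winning $1,000,000 = 1,000,000 × (1 – 0.027) ≈ 973,000.

Luck of winning $1 = 1 × (1 – 0.027) ≈ 0.97.

Since the probability is the same in both cases, the difference in luck is driven entirely by the difference in significance.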

On the other hand, defenders of objective probabilistic views might in principle explain why luck is a vague notion in epistemic terms. They might argue that knowing exactly how lucky someone is with respect to an event entails knowing the exact probability of the event’s occurrence. However, the relevant probabilities are typically unknown or, at best, only approximately known, which might in principle help explain why, say, a goal from a corner kick is neither clearly lucky nor clearly produced by skill: prior to its occurrence, the probability that it would occur was unclear.

Finally, as we have seen, McKinnon thinks that her view also helps explain why luck is good or bad: luck is good or bad depending on whether the actual deviation from the expected ratio is positive or negative.

b. Subjective Accounts

A different way to model luck in probabilistic terms is by means of subjective probabilities, that is, the kind of probabilities that are determined by an agent’s evidence or degree of belief. One way to state this kind of view is that whether or not an event counts as lucky for an agent depends on the agent’s degree of belief in the occurrence of the event, that is, on how confident she is or how strongly she believes that the event will occur—see Latus (2003), Rescher (1995: 78–80), and Steglich-Petersen (2010) for relevant discussion. More precisely:

SP1: A significant event E is lucky for an agent S at time t if and only if, just before the occurrence of E at t, S had a low degree of belief that E would occur at t.

A subjective probabilistic account might be also formulated in terms of the agent’s evidence for the occurrence of the event—see Steglich-Petersen (2010):

SP2: A significant event E is lucky for an agent S at time t if and only if, given S’s evidence just before the occurrence of E at t, there was low probability that E would occur at t.

SP1 and SP2 characterize luck as a perspectival notion: if for A but not for B it is subjectively improbable that an event E will occur, then, if E occurs, E is lucky for A but not for B—Latus (2003) endorses this thesis. For example, suppose that someone receives a big check from a secret benefactor. From that person’s perspective, it is good luck that she has received the check, but from the perspective of the benefactor, it is not—the example is from Rescher (1995: 35). In addition, those who firmly believe in fate or whose evidence strongly points to its existence are never lucky according to these views, because everything that happens to them is highly probable from their perspective.

Stoutenburg (2015) gives a similar evidential account of degrees of luck. The idea is that an agent is lucky with respect to an event to the extent that her evidence does not guarantee its occurrence, in the sense that if the conditional probability of the occurrence of the event given the agent’s evidence is not maximal, she is lucky to some degree with respect to that event:

SP3: A significant event E is lucky to some degree for an agent S at time t if and only if, given S’s evidence just before the occurrence of E at t, the probability that E would occur at t is not 1.

A problem for views such as SP1, SP2, and SP3 is that events are no less lucky when we have no evidence about them or have not thought about them—see Steglich-Petersen (2010). For example, someone would clearly be lucky if, unbeknownst to her, a bullet just missed her head by centimeters. Steglich-Petersen (2010) thinks that one way to fix this problem is to formulate a subjective view in terms of the agent’s total knowledge instead of her degree of belief in, or evidence for, the occurrence of the event:

SP4: A significant event E is lucky for an agent S at time t if and only if, for all S knew just before the occurrence of E at t, there was low probability that E would occur at t.

SP4 is compatible with an event being lucky for the agent when she has no prior evidence or doxastic state about its occurrence. But SP4 might still not yield the right results. Consider a macabre lottery in which all the participants have been poisoned and the only way to survive is to win the prize, which is the antidote. The lottery draw is a fair one, so surviving is a pure matter of chance. Suppose that the only difference in knowledge between two participants, A and B, is that only A knows that she has been poisoned and is a participant in the lottery. For all A knows, there is low probability that she will survive. In contrast, for all B knows, her survival is very likely—she is a healthy person and has no reason to think that she has been poisoned. According to SP4, B would not be lucky if she won the lottery and survived as a result. Intuitively, however, A and B would be equally lucky if they won the lottery.

In general, this and other cases might be taken to illustrate that what is apparently lucky does not always coincide with what is actually lucky—see Rescher (2014) for the distinction between apparent and actual luck. A potential problem for subjective views is then that they might be only capturing intuitions about the former.

Steglich-Petersen (2010) advances a different account, which is not probabilistic in nature, but which is worth considering in this section, not only because it is a natural development of SP4, but also because, like SP2, SP3, and SP4, it characterizes luck as an epistemic notion. In particular, it analyzes luck in terms of the agent’s epistemic position with respect to the future occurrence of the lucky event:

SP5: A significant event E is lucky for an agent S at time t if and only if, just before the occurrence of E at t, S was not in a position to know that E would occur at t.

Steglich-Petersen explains that we are in a position to know that an event will occur if, by taking up the belief that the event will occur, we thereby know that it will occur. SP5 yields the correct result in the macabre lottery case, which was troublesome for SP4. None of the participants is in a position to know that she will win the lottery and survive as a result. For that reason, the winner is lucky.

However, SP5 might not capture the intuitions about other cases correctly. Suppose that someone holds a ticket in a fair lottery. During the lottery draw, a Laplacian demon predicts and tells that person that she will be the winner, so she comes to know in advance—and is therefore in a position to know—that she will be the winner. However, that person is no less lucky to win the lottery because of that knowledge or because of being in that position. After all, it is still a coincidence that she has purchased the ticket that corresponds to the accurate prediction of the demon. In sum, knowing that one will be lucky—and therefore being in a position to know it—does not necessarily prevent one from being lucky.

Before considering an alternative approach to luck, let us see how subjective probabilistic accounts explain the three general features of luck presented at the beginning of the article. On the one hand, they can account for degrees of luck in terms of degrees of subjective probability. As we have seen, SP3 says that an agent is increasingly lucky with respect to an event the less likely the occurrence of the event—conditional on her evidence—is. On the other hand, advocates of the subjective approach might explain borderline cases of luck by appealing to the fact that the relevant subjective probabilities are not always transparent, so if we cannot determine whether an event is lucky or non-lucky, it is plausibly because the relevant subjective probabilities cannot be determined either. Finally, to explain why luck is good or bad defenders of subjective accounts can simply include a significance condition on luck in their analyses.

4. Modal Accounts

A different approach to luck emphasizes the fact that paradigmatic instances of luck such as lottery wins could have easily failed to occur. Modal accounts accordingly explain luck in terms of the notion of easy possibility. As usual in areas of philosophy where the notion of possibility is invoked, advocates of the modal approach use possible worlds terminology to explain that notion and in turn the concept of luck. In this sense, that a lucky event could have easily not occurred means that, although it occurs in the actual world, it would fail to occur in close possible worlds.

Closeness is simply assumed to be a function of how intuitively similar possible worlds are to the actual world. For example, if an event E occurs at time t in the actual world, close possible worlds can be obtained by making a small change to the actual world at t and by seeing what happens to E at t or at times close to t—see Coffman (2007; 2014) for relevant discussion. One should keep in mind that although current modal views are closeness views, it is in principle possible to give a modal account of luck that ranges over distant possible worlds.

Several formulations of modal conditions on luck can be found in the literature, where the main point of disagreement concerns the proportion of close possible worlds in which an event must fail to occur in order for its actual occurrence to be by luck. For discussion purposes, however, those conditions will be presented here as if they constituted full-fledged analyses of luck, but it is important to keep in mind that modal conditions are typically considered necessary but not sufficient for a significant event to be by luck. A prominent exception is Pritchard (2005), who is the only author in the literature advocating a pure modal account of luck—in more recent work (2014), he drops the significance condition from his analysis, plausibly because he is mainly interested in giving an account of non-relational luck. Also for discussion purposes, the analyses of luck below will be presented, as before, as analyses of significant events. Without further ado, let us consider the following modal account by Pritchard (2005: 128):

M1: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in a wide proportion of close possible worlds in which the relevant initial conditions for E are the same as in the actual world.

According to M1, one is lucky to win a fair lottery because in a wide class of close possible worlds one would lose. M1 has two important features. The first is that it does not treat every close possible world as relevant to determining whether an event is lucky: only those worlds in which the relevant initial conditions are the same as in the actual world count. According to Pritchard (2014), the relevant initial conditions for an event are specific enough to allow a correct assessment of the luckiness of the target event, but not so specific as to guarantee its occurrence. Nonetheless, Pritchard leaves it as a contextual matter which features of the actual world need to be kept fixed in our evaluation of close possible worlds. For instance, when we assess the modal profile of lottery results, we typically keep fixed features such as the fairness and the odds of the lottery or the fact that one has decided to purchase a specific lottery number.

Riggs (2007) argues that M1 is defective precisely because there is no non-arbitrary way to fix the relevant initial conditions. In reply, Pritchard (2014) argues that an analysis of a concept should not be more precise than the concept that the analysis intends to account for. Given that luck is a vague notion, the somewhat vague clause on initial conditions might after all be doing some explanatory work.

The second important feature of M1 is that it requires that the lucky event fails to occur in a wide proportion of close possible worlds. Pritchard (2005: 130) explains that by “wide” he means at least approaching half the close possible worlds, where events that are clearly lucky would not obtain in most close possible worlds.

However, there are clearly lucky events, such as obtaining heads by flipping a coin, that would still occur in a large proportion of close possible worlds—since the probability of heads is 0.5, we can suppose that the outcome would still be heads in half the close possible worlds, so the event would not fail to occur in most of them. Perhaps the following, slightly different formulation is to be preferred—see Coffman (2007):

M2: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in at least half the close possible worlds in which the relevant initial conditions for E are the same as in the actual world.

However, Levy (2011: 17–18) argues that if we accept that an event that does not occur in half the close possible worlds is lucky, we can also accept that an event that does not occur in a little less than half the close possible worlds—for example, in 49 percent of them—is lucky as well. In view of this, Levy thinks that it is better not to commit one’s modal account to a precise view of the issue. Instead, Levy argues that there is no fixed proportion of close possible worlds in which an event must fail to occur in order to be considered lucky in the actual world. His point is that there might be different “large enough” proportions of close possible worlds in which events must fail to occur in order to be considered lucky. According to Levy, what makes the threshold vary from case to case is the significance that the event has for the agent. A modal account in the spirit of Levy’s considerations would then be the following:

M3: A significant event E is lucky for an agent S at time t if and only if E occurs in the actual world at t but does not occur at t or at times close to t in a large enough proportion of close possible worlds in which the relevant initial conditions for E are the same as in the actual world, where the relevant proportion of close possible worlds is determined by the significance that E has for S.

Lackey (2008) raises two important objections to the modal approach. The first one challenges the idea that the easy possibility of an event not occurring is necessary for luck. She proposes a counterexample involving a modally robust lucky event. Suppose that (i) A buries a treasure at location L and that (ii) B independently places a plant in the ground at L. When digging, B discovers A’s treasure. Lackey’s point is that if we stipulate that A’s and B’s independent actions are sufficiently modally robust, in the sense that there is no chance that they would fail to occur in close possible worlds, B’s discovery, which is undeniably lucky, would occur in most close possible worlds.

Pritchard (2014) and Levy (2009) try to circumvent the objection in two steps. First, they distinguish between the notions of luck and fortune. Then, they propose an error theory according to which most people would be mistaken to say that B’s discovery is by luck: B’s discovery is in reality fortunate, not lucky—see section 7 for the specific way in which Pritchard and Levy distinguish luck from fortune.

Lackey’s second objection targets the idea that the easy possibility of an event not occurring is sufficient for luck. Lackey thinks that whimsical events—that is, events that result from actions that are done on a whim—show exactly this. For instance, suppose that someone decides to catch the next flight to Paris on a whim. That person’s going to Paris is not by luck—since it is the result of her self-conscious decision—but it would nevertheless fail to occur in most close possible worlds—since she has made the decision on a whim.

In reply, Broncano-Berrocal (2015) argues that Lackey’s objection overlooks the clause on initial conditions of modal accounts: if someone decides to go—and goes—to Paris on a whim, close possible worlds in which the relevant initial conditions for that trip are the same as in the actual world—that is, the only possible worlds that according to modal views are relevant to assessing whether the trip is by luck—are worlds in which that person makes the decision to go to Paris. But, consistently with what modal accounts say, that person goes to Paris in most of those worlds. In a similar way, when it comes to evaluating whether someone in possession of a specific ticket is lucky to have won the lottery, we only consider close possible worlds in which she has decided to buy that specific ticket. Again, in most of those worlds, that ticket is a loser, just as modal accounts predict.

On the other hand, Hales (2014) thinks that cases of lucky necessities are problematic not only for objective probabilistic accounts but also for modal views. For example, if Jack the Ripper is terrorizing the neighborhood and it is one’s dearest friend Bob knocking on one’s door, one might be lucky that Bob is not Jack the Ripper, but it is metaphysically impossible that Bob is Jack the Ripper because things are self-identical—Hales gives credit to John Hawthorne for the example.

Before turning to lack of control views, let us see how modal accounts explain the three general features of luck. Concerning goodness or badness, modal views can simply include a significance condition—although, as noted, Pritchard (2014), one of the main advocates of the modal approach, thinks that a significance condition is not necessary for luck. In addition, we have seen that the clause on the relevant initial conditions of the event is vague enough to preserve the characteristic vagueness of the concept of luck.

On the other hand, modal views have at least two interesting ways to account for degrees of luck—the terminology below is from Williamson (2009), who applies it to the safety condition for knowledge. M1, M2, and M3 adopt what can be called the proportion view of the gradualness of luck: they cash out the degree of luck of an event in terms of the proportion of close possible worlds in which it would fail to occur—the larger the proportion of such close possible worlds is, the luckier the event is. Church (2013) argues that the proportion view should not be restricted to close possible worlds only: degrees of luck should be modeled in terms of all relevant possible worlds, although he also argues that more weight should be given to close ones.

The idea that more weight should be given to some possible worlds when fixing the degree of luck of an event suggests a different view of the gradualness of luck. The view, which can be called the distance view, says that the degree of luck of an event varies as a function of the distance from the actual world of the possible worlds in which it would fail to occur. In this way, the closer those worlds are, the luckier the event is—Pritchard (2014) endorses the distance view.

On a related note, modal theorists can explore the relation between the significance of a lucky event and its modal profile. As we have seen, Levy (2011) thinks that the size of the proportion of close possible worlds in which an event must fail to occur in order to count as lucky is sensitive to the significance that the event has for the agent. Although Levy thinks that it is a mistake to seek much clarity about how the latter affects the former, he also believes that there is a relation of inverse proportionality between the two: the more significant an event is for an agent, the smaller the proportion of close possible worlds in which it would not occur needs to be for it to be considered lucky for the agent—Coffman (2014) calls this the inverse proportionality thesis; see Levy (2011: 36).

By way of illustration, compare surviving a round of Russian roulette played with a six-shot revolver containing a single bullet—approximately a 0.16 probability of being shot—with winning one dollar in a poker game after having called an all-in that one knew one had only a 0.16 probability of losing. In both cases, one would succeed—that is, one would survive or win—in most close possible worlds, but only the former case is considered clearly lucky. The inverse proportionality thesis accounts for the difference: surviving is such a significant event that the proportion of close possible worlds in which one dies need not be large for one’s actual survival to be considered lucky. However, Coffman (2015: 40) argues that the thesis is not sustainable precisely because it leads to the result that all extremely significant events count as lucky if there is at least a small non-zero chance that they will not happen—for example, the thesis seems to entail that we are lucky to survive every time we take a flight.

5. Lack of Control Accounts

One of the most widespread intuitions about luck is that lucky events are events beyond our control. For example, one way to explain why we are lucky to win the lottery is that the outcome of the lottery is beyond our control. In the literature, different lack of control views account for luck in those terms.

Some authors give pure lack of control accounts—for example, Broncano-Berrocal (2015), Riggs (2009). Other authors think that lack of control conditions are necessary but not sufficient for significant events to be by luck—for example, Coffman (2007; 2009), Latus (2003), Levy (2009; 2011). As in the case of modal conditions, and mainly for discussion purposes, the latter will be presented as if they constituted full-fledged analyses of luck—also as before, the analyses will be presented as analyses of significant events. That said, the simplest lack of control account has the following form:

LC1: A significant event E is lucky for an agent S at time t if and only if E is beyond S’s control at t.

Many lucky events are beyond our control, so LC1 seems to be on the right track. However, Lackey (2008) argues that the fact that a significant event is beyond our control is neither necessary nor sufficient for the event’s being lucky. Against the sufficiency claim, Lackey argues that many nomic necessities—for example, sunrises—are not under our control, but that does not mean that they are by luck—see also Latus (2003) for this objection. To show that lack of control is not necessary for luck, Lackey proposes a case in which a demolition worker, A, presses the button of the demolition system she has designed and succeeds in demolishing the warehouse she was planning to demolish, but only because the electrical current is accidentally restored after the damage caused by a mouse chewing through the connection wires. According to Lackey, the explosion is both under A’s control and by luck.

Coffman (2009) and Levy (2011), who think that lack of control is not sufficient for luck, argue that Lackey’s counterexample to the necessity claim rests on the false thesis—called by Coffman the luck infection thesis—that if luck affects the conditions that enable an exercise of control, then the exercise of control itself is by luck; more generally, if S is lucky to be in a position to ϕ and S ϕ-es, then S ϕ-es by luck. The thesis, according to Coffman, has blatant counterexamples. For example, a lifeguard who accidentally goes to work very early and sees a swimmer drowning is lucky to be in a position to save the swimmer, but if she performs the rescue competently, it is not by luck that she saves him.

To overcome this and other objections, lack of control theorists define the notion of control in different ways. For example, Coffman (2009) thinks that an event is under an agent’s control just in case she is free to do something that would help produce it and something that would help prevent it. Rescher (1969: 329) gives a similar account of control as the capacity to produce the occurrence of an event—what Rescher calls positive control—and the capacity to prevent it—what he calls negative control. While Rescher defends a probabilistic account of luck, Coffman thinks that lack of both negative and positive control—when understood in terms of freedom—is necessary for luck. The following is a lack of control view in the spirit of Coffman’s and Rescher’s respective conceptions of control:

LC2: A significant event E is lucky for an agent S at time t if and only if S is not both free to do something that would help produce E at t—or lacks the capacity to do it—and free to do something that would help prevent E at t—or lacks the capacity to do it.

An immediate problem for LC2 is that having control is not the same as exercising it. We might have control over something in the sense that we are free or have the capacity to control it, but that does not mean that we actually exercise that capacity or freedom. For example, a competent pilot who is free or has the capacity to produce and prevent a plane crash but who refuses to take control of the plane for some reason is objectively lucky that a passenger manages to land the plane safely and that she survives as a result.

Levy (2011: chap. 5) understands control in terms similar to Coffman’s and Rescher’s, but he introduces additional epistemic constraints. For Levy, an event is under an agent’s control just in case there is a basic action that she could perform such that she knows both that performing it would bring about the event and how it would do so. This way of understanding control can be supplemented with Rescher’s point that agents can also control an event by inaction, omission, or inactivity (Rescher 1969: 369). Taking the latter into account, the following is a pure lack of control view in the spirit of Levy’s conception of control:

LC3: A significant event E is lucky for an agent S at time t if and only if S is not able to perform—or to omit performing—a basic action such that S knows both that its performance—or non-performance—would bring about—or prevent—E at t and how it would do so.

According to LC3, if we do not want to be exposed to the whims of luck, not only do we have to be able to perform—or omit performing—actions that causally influence the world, but we also need to know that, and how, the world is sensitive to them.

A potential problem for LC3 is that we might be properly described as being in control of something when we act in a way that brings it to a desired state even though we do not know exactly how this happens. For example, a driver might know that by turning the steering wheel to the left she will avoid an obstacle in the road, but she might be completely mistaken about how exactly this works—for example, she might erroneously believe that, whenever she turns the steering wheel to the left, it is a magical dwarf who moves the car to the left. So, she knows that her basic action will bring about the desired effect while failing to know how. The problem is that if that person competently avoids the obstacle, the maneuver seems to be under her control, no matter that she mistakenly thinks that it is under the dwarf’s.

A different lack of control account is due to Riggs (2009), who tries to defend the lack of control approach from Lackey’s objection that the fact that an event is beyond our control does not suffice for the event’s being lucky. Riggs admits that many nomic necessities—for example, sunrises—are beyond our control, but points out that we can still exploit them to our advantage. The idea is that if we exploit them for some purpose, they are not lucky for us even if they are not under our control. The following analysis accounts for luck in those terms:

LC4: A significant event E is lucky for an agent S at time t if and only if (i) E is beyond S’s control at t and (ii) S did not successfully exploit E, prior to E’s occurrence at t, for some purpose.

To illustrate how LC4 can distinguish between lucky and non-lucky physical events beyond our control, Riggs proposes a case in which two people, A and B, are about to be executed, but only A knows two important facts: first, that their captors believe that solar eclipses are in reality a message from the gods telling them to stop sacrifices; second, that, unbeknownst to their captors, a solar eclipse will take place at the exact time the execution is planned. Riggs thinks that, while B is lucky to be released, A is not. By being in a position to exploit the eclipse in her favor, A is in control of the situation.

Coffman (2015: 10) argues via counterexample that LC4 does not correctly distinguish between lucky and non-lucky physical events beyond our control. He proposes a case in which someone lives in an underground facility that is, unbeknownst to her, solar-powered. According to Coffman, that person, who has become completely oblivious to sunrises, is not lucky that the sun rises every morning and keeps her facility running, even though the sunrise is both beyond her control and not successfully exploited by her for any purpose.

Broncano-Berrocal (2015) gives a lack of control account in the spirit of Riggs’s, but with significant differences. According to Broncano-Berrocal, there are two ways in which something might be under our control. On the one hand, we exercise effective control over something by competently bringing it to a desired state—for example, by causally influencing it in a certain way. On the other hand, something is under our tracking control when we actively check or monitor that it is currently in a certain desired state, so that we are thereby disposed or in a position either (i) to exercise effective control over it or (ii) to act in a way that would allow us to achieve goals related to the thing controlled—for example, exploiting it to our advantage. By way of illustration, when flying on autopilot mode, a pilot does not exercise effective control over the plane—for example, she does not exert any causal influence on it—but the plane is under her tracking control if she is sufficiently vigilant. A key point of Broncano-Berrocal’s account is that, depending on the practical context, attributions of control such as “Event E is under S’s control” might refer either to effective control, to tracking control, or to both. The corresponding account of luck is the following:

LC5: A significant event E is lucky for an agent S at time t if and only if E is beyond S’s control at t, where E is beyond S’s control at t if either (i) S lacks effective control over E, or (ii) E is not under S’s tracking control, or (iii) both.

Lotteries are typically not under our tracking control—although they might be if a Laplacian demon tells us what the result will be. The reason why winning a fair lottery is a matter of luck is, according to LC5, that we are not able to causally influence the result in the desired way, that is, the fact that we lack effective control. By the same token, LC5 also counts as lucky winning a lottery that, unbeknownst to one, has been rigged in one’s favor.

LC5 allows for a different response to Lackey’s demolition case: Lackey’s intuition that the explosion is under A’s control can be explained in terms of the fact that A exercises effective control over the explosion by pressing the button. But the intuition that A is lucky to demolish the warehouse is parasitic on the fact that the explosion is not under A’s tracking control. In particular, the practical context provided by Lackey is such that A is responsible for the design of the demolition system but fails to notice that the connection wires are damaged—sometimes, tracking control might be very difficult to achieve. In a similar way, LC5 explains that, while we lack effective control over many physical events—for example, sunrises—the reason why they are not lucky is that they are under our tracking control, that is, they are things that we regularly monitor and thereby can exploit to our advantage.

Coffman’s solar-powered facility case, the counterexample to LC4, is also a counterexample to LC5. Coffman’s point is that sunrises are not lucky for the person living in the solar-powered underground facility, even though they are not under her control—tracking or effective. In reply, defenders of lack of control views might argue that it is not unreasonable to say that such a person is lucky that the sun rises every morning and, unbeknownst to her, keeps her facility running. After all, there are similar attributions of luck in ordinary speech. For example, we say things such as “S is lucky to live in an earthquake-free region” even though S is unaware of this, and S is therefore lucky that an earthquake will not make her house collapse.

Finally, Hales (2014) thinks that there are cases of skillful achievements that lack of control accounts are compelled to consider lucky. For instance, he thinks that not even the best batter in history can plausibly be said to have control over whether he hits the ball, since there are many factors over which he cannot exercise any sort of control—for example, distractions, the pitches he receives, and the play of the opposing fielders. In reply, lack of control theorists might argue that Hales is illicitly raising the standards of control. After all, intuitions about whether the result of our actions is under our control go hand in hand with intuitions about whether the result of our actions is because of our skills.

As a final note, let us briefly consider how lack of control accounts explain the three general features of luck presented at the beginning of the article. Concerning goodness or badness, lack of control views can, like other views, simply include a significance condition. Concerning vagueness, the notion of control is not precise enough to remove all vagueness from the analysis of luck. Concerning gradualness, control, like luck, comes in degrees. In particular, lack of control theorists might endorse the view that the degree of luck of an event is inversely proportional to the degree of control that the agent has over it—see Latus (2003) for further discussion.

6. Hybrid Accounts

Some authors opt for giving accounts of luck that mix modal or probabilistic conditions with lack of control conditions. The rationale behind this move is, as Latus (2003) puts it, that although lack of control over an event often goes hand in hand with the event’s having a low chance of happening—or with the event being modally fragile—there are non-lucky events that are either beyond our control—for example, sunrises—or have a low chance of occurring—for example, rare significant events brought about by ability. Latus’s hybrid view features a lack of control condition and a subjective probabilistic condition:

H1: A significant event E is lucky for an agent S at time t if and only if (i) just before the occurrence of E at t, S had a low degree of belief that E would occur at t, and (ii) E is beyond S’s control at t.

By contrast, Coffman (2007) and Levy (2011) opt for conjoining a lack of control condition with modal conditions. Coffman’s analysis is roughly the following—he includes several further refinements to handle specific cases of competing significant events:

H2: A significant event E is lucky for an agent S at time t if and only if (i) E does not occur around t in at least half the possible worlds obtainable by making no more than a small change to the actual world at t, and (ii) E is beyond S’s control at t.

Levy’s hybrid analysis (2011) features a different modal condition:

H3: A significant event E is lucky for an agent S at time t if and only if (i) E occurs in the actual world at t but does not occur at t or at times close to t in a large enough proportion of close possible worlds, where the relevant proportion of close possible worlds varies inversely with the significance of E for S, and (ii) E is beyond S’s control at t.

Levy calls this kind of luck chancy luck, but argues that there also exists a non-chancy variety of luck, which is the kind of luck that affects one’s psychological traits or dispositions relative to a reference group of individuals—for example, human beings.

All of the already discussed counterexamples to the necessity for luck of (i) subjective probabilistic conditions—for example, cases of agents without beliefs about events that are lucky for them, (ii) objective probabilistic conditions—for example, cases of highly probable lucky events, (iii) modal conditions—for example, Lackey’s buried treasure case, and (iv) lack of control conditions—for example, Lackey’s demolition case—are troublesome for hybrid views.

7. Luck and Related Concepts

There are several concepts that are closely related to the concept of luck. Here we will focus on the concepts of accident, coincidence, fortune, risk, and indeterminacy.

a. Accidents

The concept of accident is closely related to the concept of luck. After all, most accidents—for example, car crashes—involve luck—mostly bad luck. But as Pritchard (2005: 126) argues, there are paradigmatic cases of luck that involve no accidents. For example, if one self-consciously chooses a specific lottery ticket and wins the lottery, one’s winning is by luck, but it is not an accident given that one was trying to win.

From Pritchard’s example, we might infer that if an agent acts with the intention of bringing about some result, then if it occurs, it is not an accident. However, if someone prays with the intention of bringing about some event and the event occurs by sheer coincidence—because that person’s prayers are causally irrelevant to its occurrence—the event is accidental. But the mere causal relevance of an agent’s actions to an event’s occurrence is not sufficient for excluding accidentality either. If a pilot dancing in the cockpit unintentionally presses the depressurization button and as a result the plane crashes, the crash is an accident despite being caused by the pilot.

This suggests that what prevents the outcomes of an action from being accidental—but not from being lucky—is both the fact that an agent acts with the intention to bring about a certain outcome and the fact that her action is causally relevant to that outcome. For example, if someone wins a lottery in which participants have to pick a ball directly from the lottery drum with a blindfold on, that person’s winning is lucky but not accidental because it is brought about by her direct intentional action.

b. Coincidences

The concept of coincidence is also closely related to the concept of luck. Owens (1992) gives an account according to which a coincidence is an inexplicable event in the following sense: we cannot explain why its constituents come together because they are produced by independent causal factors—see also Riggs (2014) for a similar account. More specifically, coincidences are such that we cannot explain why they occur because there is no common nomological antecedent of their components or a nomological connection between them. For example, if someone prays for rain and it rains, that it rains is a coincidence because there is no nomological connection between that person’s prayers and the fact that it rains. On the other hand, how close or immediate an antecedent should be in order to prevent two events from constituting a coincidence is a matter that usually becomes clear in context. For example, we would regard as a coincidence the fact that someone wishes that her favorite team wins the final and that as a matter of fact it ends up winning the final, even though both events share some distant nomological antecedent—for example, the Big Bang; see Riggs (2014) for further discussion.

Not all lucky events are coincidental events. For example, it is no coincidence that a coin lands heads when someone flips it. But the coin’s landing heads might clearly be lucky for that person. In the same way that causally relevant intentional action prevents an event from being an accident, it seems to prevent a pair of events—someone’s flipping of the coin and the coin’s landing heads—from being a coincidence. By contrast, all coincidental events, if significant, are lucky. For example, if someone prays for rain because she is in need of water and it rains, the coincidental event that it rains is lucky for that person.

Probabilistic and modal views have difficulties when it comes to accounting for highly probable or modally robust lucky events arising out of coincidence. As Lackey’s buried treasure case illustrates, if the occurrence of the components of a coincidence—A’s burial of the treasure and B’s digging at the same location—is highly probable or modally robust, the occurrence of the resulting coincidental event—B’s discovery of A’s treasure—is also highly probable or modally robust. Yet, the event is lucky precisely because it arises out of a coincidence.

c. Fortune

In the literature, there is some disagreement concerning whether or not the concept of fortune is the same as the concept of luck. Most modal theorists think that luck and fortune are different and use the distinction to argue that Lackey’s buried treasure case is in reality a case of fortune, while their theories are theories of luck.

For example, Pritchard (2005: 144, n.15; 2014) thinks that fortunate events are events beyond our control that count in our favor, but unlike lucky events, they are not chancy or modally fragile. In his view, having good health or a good financial situation are instances of fortune, not of luck, while winning a fair lottery is only an instance of luck. Rescher (1995: 28–9) similarly thinks that we can be fortunate if something good happens to or for us in the natural course of things, but we are lucky only if such an eventuality is chancy. In a similar vein, Coffman (2007; 2014) thinks that we are lucky to win a fair lottery—given how unlikely it was—but we are merely unfortunate to lose it—given how likely it was.

Finally, Levy (2009; 2011: 17) thinks that fortunate events are non-chancy events—hence non-lucky—but luck-involving, in the sense that they have luck in their causal history and, in particular, in their proximate causes. His reply to Lackey’s buried treasure case is that luck in the circumstances—the lucky coincidence that someone places a plant at the same location in which someone has buried a treasure—is not inherited by the actions performed in those circumstances or by the events resulting from them—for example, the discovery of the treasure. So while there is luck involved in the circumstances of the discovery, the discovery itself is merely fortunate.

Against the distinction between luck and fortune, Broncano-Berrocal (2015) and Stoutenburg (2015) argue that the terms “luck” and “fortune” can be interchanged in English sentences without any significant semantic difference. Moreover, since English speakers use the terms interchangeably, arguing that luck and fortune are two distinct concepts entails that speakers are systematically mistaken in their usage of the terms, an error theory that is hard to sustain. On such a view, for example, we would be wrong to say that someone is fortunate to win a raffle or lucky to win a lottery that, completely unbeknownst to her, has been rigged in her favor.

d. Risk

There is a close connection between the concepts of luck and risk. In fact, some theorists think that the connection is so close that the former can be explained in terms of the latter—see Broncano-Berrocal (2015), Coffman (2007), Pritchard (2014; 2015), and Williamson (2009) for relevant discussion. On the one hand, Pritchard (2015) explains that a risk or a risk event is a potential, unwanted event that is realistically possible—that is, something that could credibly occur—whereas a risky event is a potential, unwanted event that has a higher risk than normal of occurring—for example, there is always a risk that one’s plane might crash, but flying by plane is not risky. With that distinction in place, Pritchard distinguishes two competing ways to understand the notion of risk or of risk event.

The probabilistic account of risk says that an event is at risk of occurring just in case there is a non-zero objective probability that it will occur. How high its risk of occurrence is—that is, how risky it is—depends on how probable its occurrence is. The modal account of risk, by contrast, says that an event is at risk of occurring just in case it would occur in at least some close possible worlds—see also Coffman (2007) and Williamson (2009). How high its risk of occurrence is—that is, how risky it is—depends on how large the proportion of close possible worlds in which it would occur is—call this the proportion view of degrees of risk—or on how distant the possible worlds in which it would occur are—call this the distance view of degrees of risk.

Pritchard contends that the probabilistic account fails to adequately account for degrees of risk. In particular, he argues that if two risk events E1 and E2 have the same probability of occurring but only E1 is such that its occurrence is easily possible, then E1 is riskier than E2; yet the probabilistic account is committed to saying that they are equally risky.

Pritchard (2014; 2015) also argues that when risk is understood in modal terms, the notions of luck and risk are basically co-extensive, because both how lucky and how risky an event is depend on the modal profile of the event’s occurrence, that is, on the size of the proportion of close possible worlds in which it would not obtain, or the distance to the actual world of possible worlds in which it would not occur. According to Pritchard, the only two minor differences between the two notions are, on the one hand, that risk is typically associated with negative events, whereas luck can be predicated of both negative and positive events; on the other, that while we can talk of very low levels of risk, we cannot so clearly talk of low levels of luck.

Broncano-Berrocal (2015) makes a further distinction between two ways in which we think of risk: the risk that an event has of occurring—or event-relative risk—and the risk at which an agent is with respect to an event—or agent-relative risk. The distinction serves to delimit the scope of Pritchard’s account: his modal account of risk is an account of event-relative risk—the same applies to the probabilistic view. For Broncano-Berrocal, the modal and probabilistic accounts of event-relative risk are both correct: while the probabilistic conception is the one that is typically used or assumed in scientific and technical contexts, the modal conception better fits our everyday thinking about risky events. On the other hand, the best way to understand the agent-relative sense of risk is, according to Broncano-Berrocal, in terms of lack of control: an agent is at risk with respect to the possible occurrence of an event just in case its occurrence is beyond her control. He further argues that the agent-relative sense of risk is the one that really serves to account for luck: when risk is understood in terms of lack of control, the notions of luck and risk are basically co-extensive, because whether an event is lucky or risky for an agent depends on whether it is under the agent’s control.

e. Indeterminacy

In a causally deterministic world, events are necessitated as a matter of natural law by antecedent conditions. It might be thought that lucky events are events whose occurrence was not predetermined in that way. Against this idea, Pritchard (2005: 126–27) argues that at least some lucky events are not brought about by indeterminate factors. For example, given the position and momentum of the balls in a lottery drum at time t1, it might be fully determined that a certain combination of balls will be the winning combination at t2. To make the point more vivid, Coffman (2007) proposes an example in which someone’s life depends on the fact that a ball remains perfectly balanced on the tip of a cone in a deterministic world. According to Coffman, that person can be properly described as being lucky if her stay in the deterministic world corresponds to the predetermined temporal interval in which the ball would remain balanced on the cone’s tip. Another example is the following: a Laplacian demon, who is able to predict the future given his knowledge of the complete state of a deterministic world at a prior time, might be unlucky to know in advance that he will die in a car accident. The moral of all these cases is that luck is—or at least seems—fully compatible with determinism.

8. References and Further Reading

  • Ballantyne, Nathan. 2014. Does luck have a place in epistemology? Synthese 191: 1391–1407.
    • Ballantyne argues that investigating the nature of luck does not help us better understand knowledge.
  • Ballantyne, Nathan. 2012. Luck and interests. Synthese 185: 319–334.
    • Ballantyne provides a detailed examination of the different ways to formulate the significance condition on luck.
  • Baumann, Peter. 2012. No luck with knowledge? On a dogma of epistemology. Philosophy and Phenomenological Research DOI: 10.1111/j.1933-1592.2012.00622.
    • Baumann defends an objective probabilistic condition.
  • Broncano-Berrocal, Fernando. 2015. Luck as risk and the lack of control account of luck. Metaphilosophy 46: 1–25.
    • Broncano-Berrocal proposes a lack of control account and argues that luck can be explained in terms of risk.
  • Coffman, E. J. 2015. Luck: Its nature and significance for human knowledge and agency. Palgrave Macmillan.
    • Coffman’s monograph includes extensive criticism of leading theories of luck and argues that luck can be explained in terms of the notion of stroke of luck; it also explores that idea’s applications in epistemology and the philosophy of action.
  • Coffman, E. J. 2014. Strokes of luck. Metaphilosophy 45: 477–508.
    • Coffman proposes an account of strokes of luck.
  • Coffman, E. J. 2009. Does luck exclude control? Australasian Journal of Philosophy 87: 499–504.
    • Coffman defends a specific way to understand the lack of control condition on luck.
  • Coffman, E. J. 2007. Thinking about luck. Synthese 158: 385–398.
    • Coffman gives a hybrid account of luck in terms of easy possibility and lack of control.
  • Church, Ian M. 2013. Getting ‘Lucky’ with Gettier. European Journal of Philosophy 21: 37–49.
    • Church explores several ways to model degrees of luck in modal terms.
  • Hales, Steven D. 2015. Luck: Its Nature and Significance for Human Knowledge and Responsibility, by E.J. Coffman. The Philosophical Quarterly, DOI:10.1093/pq/pqv093.
    • Critical book review of Coffman’s monograph.
  • Hales, Steven D. 2014. Why every theory of luck is wrong. Noûs, DOI: 10.1111/nous.12076.
    • Hales gives three kinds of counterexamples to probabilistic, modal, and lack of control accounts of luck.
  • Hales, Steven D. & Johnson, Jennifer Adrienne. 2014. Luck attributions and cognitive bias. Metaphilosophy 45: 509–528.
    • Hales and Johnson conduct an empirical investigation on luck attributions and suggest that the results might indicate that luck is a cognitive illusion.
  • Lackey, Jennifer. 2008. What luck is not. Australasian Journal of Philosophy 86: 255–267.
    • Lackey argues that the conditions of modal and lack of control analyses are neither sufficient nor necessary for luck.
  • Latus, Andrew. 2003. Constitutive luck. Metaphilosophy 34: 460–475.
    • Latus gives a hybrid account of luck that features subjective probabilistic and lack of control conditions and uses the account to show that the concept of constitutive luck is not incoherent.
  • Levy, Neil. 2011. Hard luck: How luck undermines free will and moral responsibility. Oxford University Press.
    • Levy proposes a hybrid account that conjoins a modal condition with a lack of control condition and argues that the epistemic requirements on control are so demanding that they are rarely met; he also applies this account to the free will debate.
  • Levy, Neil. 2009. What, and where, luck is: A response to Jennifer Lackey. Australasian Journal of Philosophy 87: 489–497.
    • Levy argues that, in light of the distinction between luck and fortune, Lackey’s buried treasure case poses no problem for modal accounts.
  • McKinnon, Rachel. 2014. You make your own luck. Metaphilosophy 45: 558–577.
    • McKinnon answers the question of what it means to say that someone creates her own luck and uses her account of diachronic luck to explain how we evaluate performances.
  • McKinnon, Rachel. 2013. Getting luck properly under control. Metaphilosophy 44: 496–511.
    • McKinnon proposes an account of diachronic luck in terms of the notion of expected value.
  • Milburn, Joe. 2014. Subject-involving luck. Metaphilosophy 45: 578–593.
    • Milburn distinguishes between subject-relative and subject-involving luck and argues that one of the upshots of focusing on the latter is that lack of control accounts of luck become more attractive.
  • Owens, David. 1992. Causes and coincidences. Cambridge University Press.
    • Owens gives an account of coincidences according to which a coincidence is an event whose constituents are nomologically independent of each other.
  • Pritchard, Duncan. 2015. Risk. Metaphilosophy 46: 436–461.
    • Pritchard argues that the standard way of conceptualizing risk in probabilistic terms is flawed and proposes an alternative modal conception.
  • Pritchard, Duncan. 2014. The modal account of luck. Metaphilosophy 45: 594–619.
    • Pritchard defends the modal account of luck from several objections.
  • Pritchard, Duncan. 2005. Epistemic luck. Oxford University Press.
    • Pritchard introduces the modal account of luck and gives corresponding accounts of epistemic and moral luck.
  • Pritchard, Duncan, & Smith, Matthew. 2004. The psychology and philosophy of luck. New Ideas in Psychology 22: 1–28.
    • Pritchard and Smith survey psychological research on luck and argue that it supports the modal account of luck.
  • Pritchard, Duncan, & Whittington, Lee John (eds.). 2015. The philosophy of luck. Wiley-Blackwell.
    • A volume with many of the papers contained in this bibliography.
  • Rescher, Nicholas. 2014. The machinations of luck. Metaphilosophy 45: 620–626.
    • Rescher defends an objective probabilistic account of luck.
  • Rescher, Nicholas. 1995. Luck: The brilliant randomness of everyday life. Farrar, Straus and Giroux.
    • Rescher provides an extensive examination of the concept of luck as well as of many other issues surrounding it.
  • Rescher, Nicholas. 1969. The concept of control. In Essays in Philosophical Analysis. University of Pittsburgh Press: 327–354.
    • Rescher provides an extensive examination of the concept of control.
  • Riggs, Wayne D. 2014. Luck, knowledge, and “mere” coincidence. Metaphilosophy 45: 627–639.
    • Riggs advances an account of coincidence and applies it to the theory of knowledge.
  • Riggs, Wayne. 2009. Knowledge, luck, and control. In Haddock, A., Millar, A. & Pritchard, D. (eds.). Epistemic value. Oxford University Press.
    • Riggs proposes a lack of control account of luck and replies to some objections.
  • Riggs, Wayne. 2007. Why epistemologists are so down on their luck. Synthese 158: 329–344.
    • Riggs criticizes the modal account of luck and defends a lack of control condition.
  • Steglich-Petersen, Asbjørn. 2010. Luck as an epistemic notion. Synthese 176: 361–377.
    • Steglich-Petersen gives an epistemic analysis of luck in terms of the notion of being in a position to know.
  • Stoutenburg, Gregory. 2015. The epistemic analysis of luck. Episteme, DOI:10.1017/epi.2014.35.
    • Stoutenburg gives an evidential account of degrees of luck.
  • Williamson, Timothy. 2009. Probability and danger. The Amherst Lecture in Philosophy 4: 1–35.
    • Williamson compares probabilistic and modal conceptions of safety and risk and discusses how they bear on the theory of knowledge.

 

Author Information

Fernando Broncano-Berrocal
Email: fernando.broncanoberrocal@kuleuven.be
University of Leuven (KU Leuven)
Belgium

Epistemic Justification

We often believe what we are told by our parents, friends, doctors, and news reporters. We often believe what we see, taste, and smell. We hold beliefs about the past, the present, and the future. Do we have a right to hold any of these beliefs? Are any supported by evidence? Should we continue to hold them, or should we discard some? These questions are evaluative. They ask whether our beliefs meet a standard that renders them fitting, right, or reasonable for us to hold. One prominent standard is epistemic justification.

Very generally, justification is the right standing of an action, person, or attitude with respect to some standard of evaluation. For example, a person’s actions might be justified under the law, or a person might be justified before God.

Epistemic justification (from episteme, the Greek word for knowledge) is the right standing of a person’s beliefs with respect to knowledge, though there is some disagreement about what that means precisely. Some argue that right standing refers to whether the beliefs are more likely to be true. Others argue that it refers to whether they are more likely to be knowledge. Still others argue that it refers to whether those beliefs were formed or are held in a responsible or virtuous manner.

Because of its evaluative role, justification is often used synonymously with rationality. There are, however, many types of rationality, some of which are not about a belief’s epistemic status and some of which are not about beliefs at all. So, while it is intuitive to say a justified belief is a rational belief, it is also intuitive to say that a person is rational for holding a justified belief. This article focuses on theories of epistemic justification and sets aside their relationship to rationality.

In addition to being an evaluative concept, many philosophers hold that justification is normative. Having justified beliefs is better, in some sense, than having unjustified beliefs, and determining whether a belief is justified tells us whether we should, should not, or may believe a proposition. But this normative role is controversial, and some philosophers have rejected it for a more naturalistic, or science-based, role. Naturalistic theories focus less on belief-forming decisions—decisions from a subject’s own perspective—and more on describing, from an objective point of view, the relationship between belief-forming mechanisms and reality.

Regardless of whether justification refers to right belief or responsible belief, or whether it plays a normative or naturalistic role, it is still predominantly regarded as essential for knowledge. This article introduces some of the questions that motivate theories of epistemic justification, explains the goals that a successful theory must accomplish, and surveys the most widely discussed versions of these theories.

Table of Contents

  1. Starting Points
    1. The Dilemma of Inferential Justification
    2. Explaining How Beliefs are Justified
    3. Explaining the Role of Justification
    4. Explaining Why Justification is Valuable
    5. Justification and Knowledge
  2. Internalist Foundationalism
    1. Basic Beliefs
    2. Arguments For and Against Foundationalism
  3. Internalist Coherentism
    1. Varieties of Coherence
    2. Objections to Coherentism
  4. Infinitism
    1. Arguments for Infinitism
    2. Objections to Qualified Infinitism
  5. Types of Internalism and Objections
    1. Accessibilism and Mentalism
    2. Objections to Internalism
  6. The Gettier Era
    1. The History of the Gettier Problem
    2. Responses to the Gettier Problem
  7. Externalist Foundationalism
    1. Externalism, Foundationalism, and the DIJ
    2. Reliabilism
    3. Objections to Externalism
  8. Justification as Virtue
    1. Virtue Reliabilism
    2. Virtue Responsibilism
    3. Objections to Virtue Epistemology
  9. The Value of Justification
    1. The Truth Goal
    2. Alternatives to the Truth Goal
    3. Objections to the Polyvalent View
    4. Rejections of the Truth Goal
  10. Conclusion
  11. References and Further Reading

1. Starting Points

Consider your simplest, most obvious beliefs: the color of the sky, the date of your birth, what chocolate tastes like. Are these beliefs justified for you? What would explain the rightness or fittingness of these beliefs? One prominent account of justification is that a belief is justified for a person only if she has a good reason for holding it. If you were to ask me why I believe the sky is blue and I were to answer that I am just guessing or that my horoscope told me, you would likely not consider either a good reason. In either case, I am not justified in believing the sky is blue, even if it really is blue. However, if I were to say, instead, that I remember seeing the sky as blue or that I am currently seeing that it is blue, you would likely think better of my reason. So, having good reasons is a very natural explanation of how our beliefs are justified.

Further, the possibility that my belief that the sky is blue is not justified, even if it is true that the sky is blue, suggests that justification is more than simply having a true belief. All of my beliefs may be true, but if I obtained them accidentally or by faulty reasoning, then they are not justified for me; if I am seeking knowledge, I have no right to hold them. Further still, true belief may not even be necessary for justification. If I understand Newtonian physics, and if Newton’s arguments seem right to me, and if all contemporary physicists testify that Newtonian physics is true, it is plausible to think that my belief that it is true is justified, even if Einstein will eventually show that Newton and I are wrong. We can imagine this was the situation of many physicists in the late 1700s. If this is right, justification is fallible—it is possible to be justified in believing false propositions. Though some philosophers have, in the past, rejected fallibilism about justification, it is now widely accepted. Having good reasons, it turns out, does not guarantee having true beliefs.

But the idea that justification is a matter of having good reasons faces a serious obstacle. Normally, when we give reasons for a belief, we cite other beliefs. Take, for example, the proposition, “The cat is on the mat.” If you believe it and are asked why, you might offer the following beliefs to support it:

1. I see that the cat is on the mat.

2. Seeing that X implies that X.

Together, these seem to constitute a good reason for believing the proposition:

3. The cat is on the mat.

But does this mean that proposition 3 is epistemically justified for you? Even if the combination of propositions 1 and 2 counts as a good reason to believe 3, proposition 3 is not justified unless both 1 and 2 are also justified. Do we have good reasons for believing 1 and 2? If not, then according to the good reasons account of justification, propositions 1 and 2 are unjustified, which means that 3 is unjustified. If we do have good reasons for believing 1 and 2, do we have good reasons for believing those propositions? How long does our chain of good reasons have to be before even one belief is justified? These questions lead to a classic dilemma.

a. The Dilemma of Inferential Justification

For simplicity, let’s focus on proposition 1: I see that the cat is on the mat.

Horn A: If there are no good reasons to believe proposition 1, then proposition 1 is unjustified, which means 3 is unjustified.

Horn B: If there is a good reason to believe proposition 1, say proposition 1a, then either 1a is unjustified or we need another belief, proposition 1b, to justify 1a. If this process continues infinitely, then 1 is ultimately unjustified, and, therefore, 3 is unjustified.

Either way, proposition 3 is unjustified.

Horn A of the dilemma is the problem of skepticism about justification. If our most obvious beliefs are unjustified, then no belief derived from them is justified; and if no belief is justified, we are left with an extreme form of skepticism. Horn B of the dilemma is called the regress problem. If every reason we offer requires a reason that also requires a reason, and so on, infinitely, then no belief is ultimately justified.

Both of these problems assume that all justification involves inferring beliefs from one or more other beliefs, so let’s call these two problems the dilemma of inferential justification (DIJ). And let’s call the assumption that all justification involves inference from other beliefs the inferential assumption (also called the doxastic assumption, Pollock 1986: 19).

Responses to this dilemma typically take one of two forms. On one hand, we might embrace Horn A, which is, in effect, to adopt skepticism and eschew any further attempts to justify our beliefs. This is the classic route of the Pyrrhonian skeptics, such as Sextus Empiricus, and some later Academic skeptics, such as Arcesilaus. (For more on these views, see Ancient Greek Skepticism.)

On the other hand, we might offer an explanation of how beliefs can be justified in spite of the dilemma. In other words, we might offer an account of epistemic justification that resolves the dilemma, either by constructing a third, less problematic option or by showing that Horn B is not as troublesome as philosophers have traditionally supposed. This non-skeptical route is the majority position and the focus of the remainder of this article.

Philosophers tend to agree that any adequate account of epistemic justification—that is, an account that resolves the dilemma—must do at least three things: (1) explain how a belief comes to be justified for a person, (2) explain what role justification plays in our belief systems, and (3) explain what makes justification valuable in a way that is not merely practically or aesthetically valuable.

b. Explaining How Beliefs are Justified

One of the central aims of theories of epistemic justification is to explain how a person’s beliefs come to be justified in a way that resolves the DIJ. Those who accept the inferential assumption argue either that a belief is justified if it coheres with—that is, stands in mutual support with—the whole set of a person’s beliefs (coherentism) or that an infinite chain of sequentially supported beliefs is not as problematic as philosophers have claimed (infinitism).

Among those who reject the inferential assumption, some argue that justification is grounded in special beliefs, called basic beliefs, that are either obviously true or supported by non-belief states, such as perceptions (foundationalism). Others who reject the inferential assumption argue that justification is either a function of the quality of the mechanisms by which beliefs are formed (externalism) or at least partly a function of certain qualities or virtues of the believer (virtue epistemology).

In addition to resolving the DIJ, theories of justification must explain what it is about forming or holding a belief that renders it justified. Some argue that justification is a matter of a person’s mental states: a belief is justified only if a person has conscious access to beliefs and evidence that support it (internalism). Others argue that justification is a matter of a belief’s origin or the mechanisms that produce it: a belief is justified only if it was formed in a way that makes the belief likely to be true (externalism), whether through an appropriate connection with the state of affairs the belief is about or through reliable processes. The former view is called internalism because the justifying reasons—whether beliefs, experiences, testimony, and so forth—are internal mental states, that is, states consciously available to a person. The latter view is called externalism because the justifying states are outside a person’s immediate mental access; they are relationships between a person’s belief states and the states of the world outside the believer’s mental states (see Internalism and Externalism in Epistemology).

c. Explaining the Role of Justification

A second central aim of epistemology is to identify and explain the role that justification plays in our belief-forming behavior. Some argue that justification is required for the practical work of having responsible beliefs. Having certain reasons makes it possible for us to choose well which beliefs to form and hold and which to reject. This is called the guidance model of justification. Some philosophers who accept the guidance model, like René Descartes and W. K. Clifford, pair it with a strongly normative role according to which justification is a matter of fulfilling epistemic obligations. This combination is sometimes called the guidance-deontological model of justification, where “deontology” refers to one’s duties with respect to believing. Other epistemologists reject the guidance and guidance-deontological models for more descriptive models. Justification, according to these philosophers, is simply a feature of our psychology, and though our minds form beliefs more effectively under some circumstances than others, the conditions necessary for forming justified beliefs are outside of our access and control. This objective, naturalistic model of justification has it that our understanding of justification should be informed, in large part, by psychology and cognitive science.

d. Explaining Why Justification is Valuable

A third central aim of theories of justification is to explain why justification is epistemically valuable. Some epistemologists argue that justification is crucial for avoiding error and increasing our store of knowledge. Others argue that knowledge is more complicated than attaining true beliefs in the right way and that part of the value of knowledge is that it makes the knower better off. These philosophers are less interested in the truth-goal in its unqualified sense; they are more interested in intellectual virtues that position a person to be a proficient knower, virtues such as intellectual courage and honesty, openness to new evidence, creativity, and humility. Though justification increases the likelihood of knowledge under some circumstances, we may rarely be in those circumstances or may be unable to recognize when we are; nevertheless, these philosophers suggest, there is a fitting way of believing regardless of whether we are in those circumstances.

A minority of epistemologists reject any connection between justification and knowledge or virtue. Instead, they focus either on whether a belief fits into an objective theory about the world or whether a belief is useful for attaining our many and diverse cognitive goals. An example of the former involves focusing solely on the causal relationship between a person’s beliefs and the world; if knowledge is produced directly by the world, the concept of justification drops out (for example, Alvin Goldman, 1967). Other philosophers, whom we might call relativists and pragmatists, argue that epistemic value is best explained in terms of what most concerns us in practice.

Debates surrounding these three primary aims inspire many others. There are questions about the sources of justification: Is all evidence experiential, or is some non-experiential? Are memory and testimony reliable sources of evidence? And there are additional questions about how justification is established and overturned: How strong does a reason have to be before a belief is justified? What sort of contrary, or defeating, reasons can overturn a belief’s justification? In what follows, we look at the strengths and weaknesses of prominent theories of justification in light of the three aims just outlined, leaving these secondary questions to more detailed studies.

e. Justification and Knowledge

The type of knowledge primarily at issue in discussions of justification is knowledge that a proposition is true, or propositional knowledge. Propositional knowledge stands in contrast with knowledge of how to do something, or practical knowledge. (For more on this distinction, see Knowledge.) Traditionally, three conditions must be met in order for a person to know a proposition—say, “The cat is on the mat.”

First, the proposition must be true; there must actually be a state of affairs expressed by the proposition in order for the proposition to be known. Second, that person must believe the proposition, that is, she must mentally assent to its truth. And third, her belief that the proposition is true must be justified for her. Knowledge, according to this traditional account, is justified true belief (JTB). And though philosophers still largely accept that justification is necessary for knowledge, it turns out to be difficult to explain precisely how justification contributes to knowing.

Historically, philosophers regarded the relationship between justification and knowledge as strong. In Plato’s Meno, Socrates suggests that justification “tethers” true belief “with chains of reasons why” (97A-98A, trans. Holbo and Waring, 2002). This idea of tethering came to mean that justification—when one is genuinely justified—guarantees or significantly increases the likelihood that a belief is true, and, therefore, we can tell directly when we know a proposition. But a series of articles in the 1960s and 1970s demonstrated that this strong view is mistaken; justification, even for true beliefs, can be a matter of luck. For example, imagine the following three things are true: (1) it is three o’clock, (2) the normally reliable clock on the wall reads three o’clock, and (3) you believe it is three o’clock because the clock on the wall says so. But if the clock is broken, even though you are justified in believing it is three o’clock, you are not justified in a way that constitutes knowledge. You got lucky; you looked at the clock at precisely the time it corresponded with reality, but its correspondence was not due to the clock’s reliability. Therefore, your justified true belief seems not to be an instance of knowledge. This sort of example is characteristic of what I call the Gettier Era (§6). During the Gettier Era, philosophers were pressed to revise or reject the traditional relationship.

In response, some have maintained that the relationship between justification and knowledge is strong, but they modify the concept justification in attempt to avoid lucky true beliefs. Others argue that the relationship is weaker than traditionally supposed—something is needed to increase the likelihood that a belief is knowledge, and justification is part of that, but justification is primarily about responsible belief. Still others argue that whether we can tell we are justified is irrelevant; justification is a truth-conducive relationship between our beliefs and the world, and we need not be able to tell, at least not directly, whether we are justified. The Gettier Era (§6) precipitated a number of changes in the conversation about justification’s relationship to knowledge, and these remain important to contemporary discussions of justification. But before we consider these developments, we address the DIJ.

2. Internalist Foundationalism

One way of resolving the DIJ is to reject the inferential assumption, that is, to reject the claim that all justification involves inference from other beliefs. The most prominent way of doing this while avoiding skepticism is to show that all chains of good inference culminate at a unique kind of belief called a basic belief. Basic beliefs are beliefs that need not be inferred from any other beliefs in order to be justified. This approach to resolving the dilemma is called foundationalism because basic beliefs serve as a foundation on which all other justified beliefs are supported; a person’s beliefs are related to one another like the parts of a building: beliefs justified by inference are analogous to the roof and walls, which are in turn supported by foundational basic beliefs (see Figure 1).

Foundationalism comprises a family of views, all of which claim, at minimum, that all justified beliefs are either basic or inferred from other justified beliefs. Classically, foundationalists combine this view with the claim that we can know whether a belief is justified—that is, whether it stands in an evidential chain that starts with a basic belief—and the claim that knowing whether we are justified helps us fulfill our epistemic duties—in other words, we do well when we form or keep beliefs that are well supported and discard or reject beliefs that are not; we do poorly when we do not.

The view that justification is a matter of having certain internal mental states is called internalism, and views that combine internalism with foundationalism are called internalist foundationalism. There is a further debate among internalists as to whether justification requires simply having certain mental states (propositional justification) or whether justified beliefs must be based on those mental states (doxastic justification). Philosophers who reject internalism are called externalists (see §7 of this article). Another debate among internalists is whether justification helps us to fulfill epistemic duties—that is, it tells us which beliefs are epistemically permissible, obligatory, or impermissible (the deontological conception of justification)—or whether it is simply a descriptive fact about our belief systems. (For an example of the latter, see Conee and Feldman 2004).

Figure 1: Simple Foundationalist Justification. The dots represent beliefs; the arrows represent inferential relations.

a. Basic Beliefs

It is one thing to say basic beliefs resolve the DIJ and quite another thing to explain how they do. René Descartes famously argued that some beliefs are basic because they are indubitable. If a belief is genuinely indubitable, Descartes argued, it cannot be false. As it is commonly understood, dubitability is a psychological, not epistemic, matter. It might be indubitable for me that my mother loves me, even if it is not true and even if it is the sort of belief that could be doubted, even perhaps by me. But Descartes used “indubitable” to describe a belief that is clear and distinct, which is supposed to guarantee that the belief is true. (See Harry Frankfurt, 1973 for a fuller discussion of clarity and distinctness.) Other foundationalists have explained how some beliefs might stop the regress in virtue of self-evidence, or their privileged role in our belief-forming systems, or their incorrigibility.

Long before Descartes, simple mathematical propositions, such as 2 + 2 = 4, and logical propositions, such as “no one is taller than herself,” were thought to be so obvious that they could not be false. These propositions, many claimed, are self-evidently true, that is, they need no supporting evidence because any attempt to support them would be weaker than their intuitive truth. Some philosophers include perceptual experiences among self-evident beliefs, experiences such as seeing red and hearing a ringing sound. Even if you misperceive a color or a sound, or misperceive what seems to be colored or what seems to be ringing, you cannot doubt that you are having the experience of seeing redness or hearing ringing.

Another explanation for why some beliefs are basic is that they play a privileged role in our belief-forming systems. Common examples of beliefs privileged in this way are those formed on the basis of sensory perception: seeing a red ball, touching what feels like a rough surface, hearing a bell. You could be hallucinating these experiences, so it is not self-evident that there is a ball, bell, or surface to experience. Nevertheless, the world impresses itself on you in this way, and it would be difficult to imagine functioning without any sense perceptions whatsoever; they play a highly privileged role in our belief systems and, therefore, can justify other beliefs (hence the emphasis that scientists have traditionally placed on observation).

Further candidates for basicality are beliefs that are true in virtue of being believed, that is, if you believe them, they are true. For example, propositions about intentional states (in other words, states about a mental state, such as hoping, doubting, thinking, believing, and so forth) logically imply the existence of the subject who is in the state. So anyone who, while thinking, believes the proposition “I think” can logically infer “I exist.” Beliefs that are true if they are held are called incorrigible. Other examples may include beliefs about introspective states such as what you believe or feel or remember. If incorrigible beliefs can be recognized as true without appeal to any other beliefs, they are good candidates for justifying other, non-basic beliefs.

Unfortunately, it is not easy to see how all of our many and various non-basic justified beliefs can be inferred from this relatively small set of basic beliefs, even if we accepted every type of basic belief just mentioned. For example, imagine you have been looking for your laptop computer. When you find it, you form the belief, “There’s my laptop.” Did seeing your computer elicit the basic belief, “I seem to be perceiving a laptop there,” from which you then inferred the belief “There’s my laptop”? Not obviously. Seeing the laptop allowed you directly—without any reasoning at all—to form the belief that you found your laptop.

Examples like these have motivated some foundationalists to expand their accounts of basic beliefs to include a wider variety of experiences. These weaker accounts allow that there are many types of non-inferentially justified beliefs, all of which are at least properly basic, where “properly basic” means a belief that is either basic in the classic sense or that meets some other condition that makes it non-inferentially justified for a person. As long as there are a sufficient number of properly basic beliefs, these philosophers argue, a certain sort of foundationalism remains plausible.

One example of how proper basicality might work is Alvin Plantinga’s (1983; 1993a) argument for the rationality of religious belief. Plantinga’s notion of proper basicality is supposed to be weak enough to avoid problems with classic basic beliefs but strong enough to avoid the DIJ. According to him, if a belief is properly basic for a person, it is rational for that person to accept it without appealing to other reasons. He uses rational instead of justified to distance himself from classical problems. (Sometimes Plantinga puts it even more weakly, such that, if a belief is properly basic for a person, that person is not irrational in holding it.) As an example, Plantinga argues that if a person is raised in a religious community where the central religious claims he hears are corroborated by the community and none of those claims is undermined by contrary experience or argument, he is not violating any epistemic duty in believing that, say, God exists. His experiences and circumstances can “call forth belief in God” in a way that does not require other beliefs and can serve as a reason to accept other beliefs (1983: 81). This is a controversial view, not least because it either changes the discussion from justification to rationality or conflates justification and rationality. Nevertheless, basic beliefs are controversial no matter how they are characterized, and Plantinga’s proper basicality is just one among several. For another attempt to defend classical foundationalism against objections, see Timothy McGrew (1995).

b. Arguments For and Against Foundationalism

Foundationalism has remained competitive in the history of justification largely because of its intuitive advantages over competing views. The most common argument for foundationalism is the positive argument that it explains how we actually form beliefs on the basis of evidence. I believe the sky is blue because I see that it is blue, not because I infer it from other beliefs about the sky. Roderick Chisholm offers a sophisticated version of this argument, concluding that “[t]hinking and believing provide us with paradigm cases of the directly evident” (1966: 28). In addition to this positive argument, foundationalists offer the negative argument that no alternative account—skepticism, coherentism, or infinitism—has the resources to satisfactorily resolve the DIJ, that is, to avoid both skepticism and an infinite regress (see BonJour and Sosa 2003). This is, perhaps, the more powerful of the arguments and merits some attention.

Skepticism motivated epistemologists to inquire into justification in the first place, so the skeptical option is generally considered a loss. As an alternative, coherentists (§3) maintain that a person’s beliefs are justified in virtue of their relationship to the person’s belief set (see Lehrer 1974). If a belief stands or can stand in a consistent, mutually supportive relationship with other beliefs—a “web of belief,” as W. V. O. Quine (1970) calls it—that belief is justified. However, there is reason to believe that, since all beliefs stand in mutually supportive relationships, at least some beliefs (perhaps all) will play an indispensable role in their own support, rendering any coherentist argument viciously circular. Since circular arguments are fallacious, if coherentism entails that justification is circular, coherentism cannot resolve the DIJ.

A more recent alternative to skepticism is infinitism (see §4), according to which all justified beliefs stand in infinite chains of inferential relations (see Klein 2005). Skepticism is avoided because every belief is justified by some other belief. Unfortunately, infinitism requires that we accept one of two questionable assumptions: either that there simply is an infinite number of justifying beliefs available (and to which our minds, in virtue of being finite, do not have access) or that there is some algorithm that, for any belief, B, can direct us to a non-circular justifying belief for B. The problem with the former assumption is that it seems to depend on faith that there is an infinite series of justifiers, which is not obviously better than having no justification at all. And the problem with the latter is that it comes dangerously close to foundationalism, where the algorithm functions as a basic belief. If the infinitist cannot refute these objections, it cannot resolve the DIJ.

These are simple concerns about coherentism and infinitism, and we consider more sophisticated objections in sections 3 and 4. But, if neither coherentism nor infinitism can provide an alternative means of resolving the original dilemma, foundationalism may be the most promising alternative to skepticism. Unfortunately for foundationalists, even if they are right that some account of basic belief would adequately resolve the dilemma of inferential justification, it is not clear that such an account is currently available. Further, there are at least two other serious objections to foundationalism.

First, there is some concern that foundationalism cannot be justified by its own account of justification, that is, foundationalism is self-defeating. Alvin Plantinga (1993b) offers a version of this objection. According to foundationalists, a belief is justified if and only if it is either basic or inferred from other justified beliefs. This criterion, though, is not itself basic on any classical conception of basic beliefs (indubitability, self-evidence, being evident to the senses, or incorrigibility), and it is not clear how it could be supported by other justified beliefs.

One straightforward response to this objection is that the arguments above (the positive argument and the negative argument by elimination) do provide, contra Plantinga, inferential support for foundationalism. In fact, Plantinga (1983; 1993) expands his own notion of proper basicality precisely to avoid the self-defeat objection. Further, if sophisticated reasoning strategies like induction could be justified on foundationalist grounds, then foundationalism itself may be justified on such grounds. For example, Laurence BonJour (1998) defends rational insight as a basic source of evidence and then argues that induction is justified by rational insight. If foundationalism is roughly correct and there are arguments grounded in rational insight that justify foundationalism, foundationalism might be vindicated. Of course, there remain concerns about the circularity of such arguments.

Other philosophers use an inference to the best explanation to defend a type of basic evidence, though these views may rightly be regarded as hybrids of foundationalism and coherentism. For example, Earl Conee and Richard Feldman (2008) argue that “[p]erceptual experiences can contribute toward the justification of propositions about the world when the propositions are part of the best explanation of those experiences that is available to the person.” The idea that what have been called basic beliefs are connected with the world and how we are positioned in the world is a better explanation of why we have the evidence we have than traditional accounts of justification. Catherine Z. Elgin (2005) offers a similar account, arguing that, while perceptions have “initial tenability” given their privileged role in our belief formation, they do not obtain this tenability in isolation from our whole evidential context; over time, certain perceptual beliefs have proved themselves to have the plausibility that allows us to privilege them.

A second objection to foundationalism is the meta-justification argument. The idea is that basic beliefs cannot resolve the DIJ because, even if their justification does not depend on other beliefs, it does depend on reasons which themselves require reasons. If I believe a proposition because it is indubitable, then I must have some reason for thinking that indubitable beliefs are likely to be true. If I do not, I am stuck with Horn A, and if I do, I am stuck with Horn B. To demonstrate this problem, Peter Klein (2005) asks us to imagine an argument between Fred and Doris, where Fred has come to what he regards as the basic belief on which his argument depends; call it b.

According to Fred, b has autonomous justification, that is, it is a type of basic belief. Doris happens to agree that b is autonomously justified but asks whether beliefs with autonomous warrant are likely to be true. The most plausible option for Fred, as a foundationalist, is the following: “He can hold that autonomously warranted propositions are somewhat likely to be true in virtue of the fact that they are autonomously warranted” (2005: 133).

If Fred is right, however, b only works as a justification for the rest of his argument precisely because he has added something to b. What has he added? Namely, that he “has a very good reason for believing b, namely b has F and propositions with F are likely to be true.” These are propositions independent of b that serve to justify b. Klein continues: “Of course Fred, now, could be asked to produce his reasons for thinking that b has F and that basic propositions are somewhat likely to be true in virtue of possessing feature F” (2005: 134). If this is right, basic beliefs do not stop the regress of reasons (see also Smithies 2014).

One response to this criticism comes from Laurence BonJour, who argues that it is plausible to think that understanding b includes a sort of built-in awareness of the content of those additional premises Klein mentions, such that understanding b constitutes, in and of itself, a reason to hold b (BonJour and Sosa 2003: 60-68). If it is possible to have an evidential state that includes, non-inferentially, all the content necessary for having a reason to believe a proposition is true, foundationalists may be able to describe a basic belief that stops the regress and avoids skepticism. But explaining just what this state is remains a point of controversy.

Another response is to construct an inference to the best explanation, as mentioned above in response to the self-defeat objection (Elgin, 2005; Conee and Feldman, 2008). The result, again, is typically a hybrid view, which may be equivalent to giving up foundationalism. Conee and Feldman say their view is closer to a “non-traditional version of coherentism” (2008: 98). And Elgin calls her view “a very weak foundationalism or…a coherence theory” (2005: 166). This raises questions about the merits of coherentism, to which we now turn.

3. Internalist Coherentism

Like foundationalists, coherentists attempt to avoid skepticism while rejecting infinitism. But they find a further problem with foundationalism. Every sensory state (seeing red, smelling cinnamon, and so forth) must be understood in a mental context, that is, one must have a set of background experiences, beliefs, and vocabulary sufficiently large for forming and understanding beliefs. All sensory beliefs, such as “I see red” and “I smell cinnamon,” require an immensely complex set of assumptions about self-reference, seeing, colors, smelling, and scents. This means that individual beliefs are not isolated bits of information that act as bricks in a building; they are nodes of information that depend for their meaning and support on a web of relationships with other beliefs.

Many coherentists accept the inferential assumption and argue that the result is not an infinite regress of inferences, but a non-linear system of support from which justification emerges as a property of the combination of inferences. As Donald Davidson puts it, “[N]othing can count as a reason for holding a belief except another belief” (2000: 156). Other coherentists reject the inferential assumption and argue that the result is a non-linear system of support from which justification emerges as a property of the set as a whole. Keith Lehrer explains: “This does not make the belief self-justified, however, even though it might be non-inferential. The belief is not justified independently of relations to other beliefs. It is justified because of the way it coheres with other beliefs belonging to a system of beliefs” (1990: 89). As we see below (§3.b), some coherentists reject the belief requirement of the inferential assumption, arguing that perceptual experiences can play a justifying role in the set of mental states that includes a person’s beliefs.

Regardless of whether coherentists accept the inferential assumption, they can allow that some beliefs are non-inferentially generated—for example, by experiences, intuitions, hunches, and so forth. But they are committed to the idea that the justification for beliefs generated in these ways depends essentially on their relationship to the person’s complete set of beliefs. Construed in this way, coherentism is specifically a view about justification and should not be confused with coherentism about truth. Some philosophers have held both coherentism about truth and justification (Blanshard 1939 and Lewis 1946), but many who hold coherentism about justification reject coherentism about truth (see BonJour 1985, ch. 5, and Truth).

a. Varieties of Coherence

Broadly, coherentists argue that a belief is justified just in case it stands in a system of mutually supporting relationships with other beliefs in a person’s system of beliefs. For instance, my belief that the cat is on the mat involves a complicated set of beliefs: I am seeing a cat, I am seeing a mat, I am seeing a cat on a mat, a cat is a particular kind of mammal, a mat is a particular type of floor covering, my vision is generally reliable under normal circumstances, these are normal circumstances, and so forth. It is difficult to imagine arranging these in a linear, foundationalist fashion. In addition, it is not clear whether some of these beliefs are more basic than some others. Nevertheless, they all cohere, which means they are logically consistent with one another and with other beliefs in my belief set, and they mutually support one another. The challenge for coherentists is to explain just what “mutual support” amounts to.

Whereas foundationalists employ the metaphor of a building (or a pyramid, in some cases) to explain justificational relationships, coherentists employ the metaphor of a web (or, in some cases, a raft), according to which, each node (or plank) works alongside the others in a non-linear fashion to constitute a stable, interconnected whole (see Figure 2, as well as Neurath 1932, Quine 1970, and Sosa 1980). There are four candidates for how the web or raft holds together: logical consistency, logical entailment, inductive probability, and explanation.

Figure 2: Simple Coherentist Justification. P-S represent propositions; the arrows represent lines of inference.

The first candidate, logical consistency, is generally regarded as necessary for coherence but too weak to stand on its own. For example, the belief that P and the belief that probably not-P are logically consistent. But they are not coherent; if one of them is true, the other is not likely to be true (BonJour 1985, ch. 5). Therefore, some early coherentists added that the relationship must also include logical entailment. This view, which I will call entailment coherentism, has it that a belief is justified just in case it entails or is entailed by every other belief in a person’s belief set (Blanshard 1939). Most coherentists now reject this relationship as overly strict, primarily because it seems possible to have two very different beliefs, neither of which entails the other and yet which are both justified. For example, consider the beliefs “I am seeing a needle puncture my skin” and “I am feeling pain.” Neither belief entails the other; nevertheless, it is intuitively plausible that both belong to a coherent set of beliefs.

Because of the problems with mere consistency and consistency plus entailment, most coherentists allow that entailment is sufficient for coherence but not necessary. To capture weaker relationships, they expand the notion to include inductive probability. Inductive probability coherentism is the view that a belief is justified just in case it is a member of a set each of whose members is entailed by or made more probable by a subset of the rest. C. I. Lewis, calling this type of justification “congruence,” puts it eloquently: “A set of statements, or a set of supposed facts asserted, will be said to be congruent if and only if they are so related that the antecedent probability of any one of them will be increased if the remainder of the set can be assumed as given premises” (1962). With their emphasis on inferential relations among beliefs, entailment and inductive probability coherentism attempt to resolve the DIJ by capturing the intuitive plausibility of the inferential assumption while avoiding the difficulties with basic beliefs.
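
To make the congruence idea concrete, it is often glossed as follows (this is a rough reconstruction rather than Lewis’s own formulation): a set of beliefs B = {b1, …, bn} is congruent just in case, for each member bi, Pr(bi | B − {bi}) > Pr(bi); that is, the probability of each member, conditional on the rest of the set, exceeds its antecedent probability.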

Unfortunately, inductive probability coherentism faces problems similar to those that face entailment coherentism. First, it seems plausible for a person to hold two justified beliefs without the antecedent probability of either increasing the epistemic probability of the other, even when conjoined with other beliefs in the set. Consider, for example, your beliefs that “the Red Sox will win the Pennant” and “John F. Kennedy was shot in 1963.” Both beliefs are reasonably part of a person’s belief system, and yet it is difficult to see how one might contribute to a set of beliefs that makes the other more probable. Second, even if a subset of beliefs in a set increases the probability of each other member, the set might not be sufficiently comprehensive or well-connected with one’s experiences to justify one’s beliefs. Imagine a set of 100 beliefs, any 99 of which render the 100th member more probable than it antecedently was. This set passes the inductive probability test and is, therefore, coherent on this account, but it includes very few beliefs. This suggests that we could arbitrarily expand or contract our set of beliefs at will without loss of rationality, so long as we preserve strong inductive inferences. Unfortunately, such arbitrary sets ignore important differences in the sources of beliefs; we can imagine two inductively coherent sets, one that includes sensory beliefs and one that does not. Inductive probability coherentism, without further qualification, implies that neither set is more rational than the other. As Catherine Z. Elgin puts it, “A good nineteenth-century novel is highly coherent, but not credible on that account. Even though Middlemarch is far more coherent than our regrettably fragmentary and disjointed views…, the best explanation of its coherence lies in the novelist’s craft, not in the truth…of the story” (2005: 159-60).

A third prominent account of coherence aimed at avoiding this criticism allows that entailment and inductive probability can contribute to coherence but only insofar as they function in a plausible explanation of the set of beliefs. According to this view, known as explanatory coherentism, beliefs are justified just in case they explain or are explained by other beliefs in the same system of beliefs (Harman 1986 and Poston 2014). This view is not committed to the inferential assumption and argues that justification is an emergent property of the explanatory relations among beliefs. Catherine Z. Elgin says that “epistemic justification is primarily a property of a suitably comprehensive, coherent account, when the best explanation of coherence is that the account is at least roughly true” (2005: 158). Elgin adds that the beliefs comprising a coherent system “must be mutually consistent, cotenable, and supportive. That is, the components must be reasonable in light of one another” (2005: 158).

Explanatory coherentism takes its motivation from responses to a problem in philosophy of science that was similar to the problem that faces inductive probability coherentism (Neurath 1932 and Hempel 1935). Not every proposition in a scientific theory is derived inferentially from others, and so there is some question as to whether such propositions could be believed justifiably. It turns out, though, that those propositions play an important explanatory role in the theory that organizes evidence and concepts in plausible ways, even if those propositions have no antecedent probability outside of the system. Elgin explains, “For example, although there is no direct evidence of positrons, symmetry considerations show that a physical theory that eschewed them would be significantly less coherent than one that acknowledged them. So physicists’ commitment to positrons is epistemically appropriate” (2005: 164). This suggests that explanations can play a justifying role independently of inferential relations, thus lending plausibility to coherentism.

Explanatory coherentism avoids criticisms of earlier accounts in that it (1) maintains that consistency is an important constraint on a belief set, (2) maintains that inferential relations contribute to explanatory power, and (3) accounts for the intuitive connection of certain beliefs with sensory evidence and non-inferential coherence relations. Nevertheless, some criticisms have led philosophers like BonJour (1985), Lehrer (1974; 1990), and Poston (2014) to add other interesting and influential conditions to coherence theories, though space prevents us from exploring them here.

b. Objections to Coherentism

There are three prominent objections to coherentism. The first, which we already encountered in §2.b, is called the circularity problem. Since coherentism depends on mutual support relations, every particular belief will likely play an essential role in its own justification, rendering coherentist justification a form of circular argument (see Figure 3).

Figure 3

The problem with circular justification is that it putatively undermines the goal of justification, which is to garner support for a claim. If a claim is inferred from itself (P → P), the concluding proposition has only as much support as the premise, and whether the premise is supported is precisely what is in question. Therefore, multiplying the inferences between a proposition and an inference to that proposition (for example, (P → Q); (Q → R); (R → S); (S → P)) cannot justify P.

In response, some coherentists argue that the circularity objection oversimplifies the view. While it is true that a belief will almost certainly play a role in its own justification, this is only problematic if we assume the justificational relationship is linear. Properly understood, justification is a property that emerges from non-linear relationships among beliefs, whether inferential or non-inferential. For example, Catherine Z. Elgin tells a story about Meg (adapted from a story by Lewis 1946), whose logic textbook was stolen. There were three witnesses to the theft, but all of them are unreliable (one is aloof, one has severe vision problems, and one is a known liar). Nevertheless, all three witnesses agree that the thief had spiked green hair. Despite the fact that none of the witnesses is reliable, their independent testimony to a single, unique proposition increases the likelihood that the proposition is true. As Elgin puts it, “This [agreement] makes a difference. … Their accord evidently enhances the epistemic standing of the individual reports” (2005: 157). If this is right, the antecedently low probability of the thief’s having spiked green hair can be raised by the combined strength of the testimonies, yielding a justified belief without vicious circularity.
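
To see how independent but individually unreliable reports can do this, consider a toy Bayesian sketch (the numbers and the simple independence model are mine, offered only for illustration; they come from neither Lewis nor Elgin). Suppose the antecedent probability that the thief had spiked green hair is 0.01, and suppose each witness would report spiked green hair with probability 0.6 if the thief really had it and with probability 0.2 if not. One report then raises the probability only to about 0.03, but three independent agreeing reports raise it to roughly 0.21:

    def posterior(prior, p_report_if_true, p_report_if_false, n_reports):
        # Bayes' theorem for n independent, agreeing reports of the same proposition.
        true_term = prior * p_report_if_true ** n_reports
        false_term = (1 - prior) * p_report_if_false ** n_reports
        return true_term / (true_term + false_term)

    print(round(posterior(0.01, 0.6, 0.2, 1), 3))  # one witness: about 0.029
    print(round(posterior(0.01, 0.6, 0.2, 3), 3))  # three agreeing witnesses: about 0.214

Nothing in this sketch settles the philosophical dispute about circularity; it only illustrates the arithmetic behind Elgin’s claim that accord among independent reports enhances their epistemic standing.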

A second objection to coherentism is called the isolation objection. Even if a collection of beliefs could explain, and thereby justify, its members, it is not obvious how this set of beliefs is connected with reality, that is, with the content the beliefs are about. In rejecting basic beliefs, coherentists reject privileging any particular cognitive state in the belief system, such as sensory experiences. All beliefs are treated equally and are evaluated according to whether they cohere with the belief set. But beliefs can cohere with one another regardless of whether their content expresses true propositions about reality. Coherence cannot guarantee that the set is not isolated from reality.

Some coherentists respond to this objection by making special provisions for beliefs that derive from coherence-increasing sources, such as sense experience. (BonJour (1985) calls such beliefs “cognitively spontaneous beliefs.”) This makes the degree of coherence partly a matter of how well the system of beliefs integrates sense perception. Others appeal to more abstract distinctions among types of justification. For example, Keith Lehrer (1986) distinguishes personal justification, which involves the traditional, internalist coherence requirement, from verific justification, which is an externalist requirement on coherence. While objective coherence may be outside a person’s ken, it nevertheless contributes, along with personal justification, to what Lehrer calls complete justification. This externalist requirement helps to ground a person’s system of beliefs in the world those beliefs are supposed to be about.

Another coherentist response to the isolation objection is to allow experience itself, not just beliefs about experience, to figure in the evaluation of coherence. Catherine Z. Elgin (2005) argues that we have good reasons to privilege some perceptual experiences over very coherent sets of beliefs. She argues that this is because perception does not—contra foundationalists—work in isolation from other sorts of evidence. She says, “Only observations we have reason to trust have the power to unseat theories. So it is not an observation in isolation, but an observation backed by reasons that actually discredits the theory” (162). This also explains how we are able to privilege some perceptual experiences over others (say, in unfavorable conditions), though she admits that her view includes “something other than coherence,” and allows that it is a very weak form of foundationalism. For a reply along these lines that maintains a more traditional version of coherentism, see Kvanvig and Riggs (1992).

A third objection is called the plurality objection. Because justification is determined solely by the internal coherence of a person’s beliefs, coherence theory cannot guarantee that there is “one uniquely justified system of beliefs” (BonJour 1985: 107). BonJour explains that this is because “on any plausible conception of coherence, there will always be many, probably infinitely many, different and incompatible systems of belief which are equally coherent” (ibid.). To show just how pernicious this problem is, Lehrer asks us to imagine one set of beliefs comprised of both necessary and contingent beliefs and then to imagine a second set created by negating all the contingent beliefs in the first set (1990: 90). This has the nasty implication that, if coherence is sufficient for justification, then “for any contingent statement a person is completely justified in accepting is such that he is also completely justified in accepting the denial of that statement” (ibid.).

One response to the plurality objection is to invoke a “total evidence” requirement on explanatory and probabilistic relations. While we can arbitrarily construct probabilistically and explanatorily coherent sets, there is a non-trivial sense in which non-belief states explain our beliefs: sensation, testimony, and so forth. A theory of explanation that includes the antecedent probabilities of the beliefs based on this evidence would be more coherent with our total evidence than an arbitrary set of beliefs that ignores them. Recent debates over the relationship between coherence and truth include sophisticated analyses of probabilistic assessments (Klein and Warfield 1994 and Fitelson 2003) and an interesting argument for the impossibility of coherence’s increasing the probability that a belief is true (Olsson 2009), but there is not space to develop these arguments here.

For more on coherentism, see Coherentism in Epistemology.

4. Infinitism

Infinitism is an internalist view that proposes to resolve the dilemma of inferential justification by showing that Horn B of the DIJ, properly construed, is an acceptable option. In fact, argue infinitists, there are no serious problems with an infinite chain of justifying beliefs.

Traditionally, epistemologists have rejected the idea that a belief’s linear chain of justifying beliefs can extend infinitely because it leaves all beliefs ultimately unjustified. Inferential justification is said to transmit justification, not create it; therefore, an infinite chain of justifying beliefs would have no source of support to transmit. Similarly, since one could not hold an infinite number of beliefs or mentally trace an infinitely long chain of beliefs, infinitism betrays a common internalist intuition that a person must be aware of good reasons for holding a belief.

Infinitists claim these criticisms are misguided. In practice, justification is not as tidy as epistemologists would have us believe. The traditional idea that the regress must stop or bottom out in basic beliefs is unrealistic and unnecessary. Few of us attempt to draw inferences long enough to arrive at basic beliefs. We often stop looking for reasons when we are content that we have fulfilled our epistemic responsibility, not because the chain has actually ended (Aikin 2011). Even foundationalists and coherentists, then, are relatively unconcerned with ultimate justification in their own epistemic behavior; holding epistemic justification to such high standards would render very few of our beliefs justified. To accommodate this messiness, infinitists might reject the inferential assumption, at least as classically understood. Like coherentists, infinitists may hold that justification is an emergent property of a set of beliefs and that justification comes in degrees such that, the longer the inferential chain, the stronger the degree of justification (Klein 2005).

a. Arguments for Infinitism

There are two main lines of argument for infinitism. The first is that foundationalism and coherentism cannot stop the structure of justification from regressing infinitely. For example, Peter Klein (2005) constructs a version of the meta-justification argument against foundationalism and argues that the most plausible version of coherentism (emergent justification accounts), because of its appeal to a basic assumption about the reliability of coherent sets, is merely a disguised form of foundationalism. If these arguments hit their mark, and if externalism is ruled out, infinitism may be the only non-skeptical option available.

The second main line of argument for infinitism is that the classic objections to infinitism are aimed at overly simplistic versions of the view; they do not threaten suitably qualified versions. For example, Scott Aikin (2009) argues that concerns about the regress arise because of a conflict between two types of intuition: (1) proceduralism, which includes our standard intuitions about good reasons and responsible believing, and (2) egalitarianism, which includes our intuitions that people are generally justified in believing a lot of things (beliefs about how to set DVRs and beliefs about how to get from home to work). Aikin claims that infinitists take the demands of proceduralism more seriously than egalitarian intuitions, maintaining that justification and knowledge are very difficult to attain. The more committed we are to following our chains of evidence, the more likely we are to attain our epistemic goals. However, we often stop far from what even foundationalists would take to be the end of those chains. And at every proposed stopping point, there is an infinite number of justificational questions about the appropriateness of the terms we are using, the reliability of our perceptions and concept attributions, and so forth. If this is right, infinitism may be the most plausible implication of our epistemic intuitions.

Similarly, Peter Klein (2014) argues that infinitism is a minimal thesis about what makes justification valuable, namely, that it renders our beliefs “reason-enhanced.” He says, “Infinitism holds that a belief-state is reason-enhanced whenever S deploys a reason for believing that p. Importantly, S can make a belief-state reason-enhanced even if the basis is another belief-state that is not (yet) reason-enhanced” (2014: 105). If this is right, then the process of inferring can create or produce original epistemic support, and we need not appeal to anything like basic beliefs for ultimate support. Further, infinitists do not object to a chain of inference’s stopping, for instance, when some presuppositions are explicit. For example, reasoning about Euclidean geometry may appropriately stop at Euclid’s axioms when we agree that they are our standard of evaluation. But we can also admit that those axioms can be challenged, and our reasoning could continue indefinitely. Infinitists simply argue that this is a standard feature of all justification.

b. Objections to Qualified Infinitism

Carl Ginet (2005) argues that even qualified infinitism is motivated on spurious grounds. One argument against foundationalism is that, even for basic beliefs, one needs a reason to believe they are true, and this initiates an infinite regress of reasons. Ginet objects, however, that this argument threatens foundationalism only if all reasons are inferential reasons. Of course, this is precisely what foundationalists reject. If some non-belief reasons are justified independently of any additional reasons for thinking they are true, that is, if they are inherently reasonable, the infinitist argument against foundationalism is question-begging.

In response, the infinitist might contend that, even if its critique of foundationalism is flawed, infinitism may yet be the more plausible alternative. If infinitism captures our intuitions about justification as adequately as foundationalism, and if it requires fewer controversial concepts (basic beliefs), infinitism may be an attractive competitor.

Another objection to infinitism is that, given our finite minds, we lack complete access to the infinite set of justifying beliefs. If a person has no access to his reason for belief, then infinitism is no longer internalist and, thereby, loses its means of defusing the DIJ. Of course, the infinitist may concede this and fall back on a mentalist account of epistemic access (see §5.a below). As Ginet puts it: a belief (L) “is available to S as a reason for so believing only if S is disposed, upon entertaining and accepting (L), to believe that the fact that (L) was among his reasons for so believing” (2005: 146). If this is right, a person may have a disposition to recognize further evidence for his justifying beliefs when prompted to do so.

Nevertheless, even this mentalist-enhanced infinitism faces the concern that the process of justification is never complete. An assumption behind the DIJ is that if, for any belief, there is no reason to believe it is true, then that belief and any beliefs inferred from it are unjustified. If this is right, and the justification condition for infinitism is never actually met, then we are left with skepticism.

A variation on this criticism is the idea that inferential justification can only transmit justification and cannot originate it. The idea is that all inference is conditional ((P → Q); (Q → R); (R → S)). Given this set of propositions, is S justified for us? That depends on whether P is justified. Telling us that P is justified by N, (N → P), though, does not answer the question of whether S is justified. We still need to know whether N is justified (Dancy 1985: 55). If this is right, then no matter how long the chain of inference is—even if it is infinite—no belief is justified.

Infinitists may respond to this objection by arguing that the justification condition is not a matter of getting to a final, infinitely large set, but of increasing one’s epistemic reasons for the proposition in question. Peter Klein, using the term “warrant” for “justification,” says that infinitism is like coherentism in this respect. He says, “Infinitism is like the warrant-emergent form of coherentism because it holds that warrant for a questioned proposition emerges as the proposition becomes embedded in a set of propositions” (2005: 135). Further, Klein explains that “warrant increases not because we are getting closer to a basic proposition but rather because we are getting further from the questioned proposition” (137). This amounts to a rejection of the claim that inferential justification can only transmit justification and, therefore, that a justificational chain must be complete in order to be adequate (recall Catherine Z. Elgin’s story about Meg in §3.b above).

A worry for this response is similar to a worry for coherentism. Any criterion that implies the infinite set of beliefs is justified is either part of the set or independent of it; if it is independent, it, too, needs a justification, and the regress resumes. If, instead, some sort of justification-conferring awareness is built into the increasingly large set, infinitism seems like foundationalism in disguise.

A further worry is that, if infinitists do not require that a person actually have an infinite number of justifying beliefs or perform an infinite number of inferences, then infinitism seems committed to the idea that inference itself can create justification. This, however, seems implausible. Carl Ginet writes, “…acceptable inference preserves justification … [but] there is nothing in the inferential relation itself that contributes to making any of those beliefs justified” (2005: 148-49). If inference cannot produce justification, it is unclear how a belief in an infinite chain of inferences comes to be justified.

For a more detailed treatment of infinitism, see Infinitism in Epistemology.

5. Types of Internalism and Objections

As noted above (§2), the view that justification is something we can determine by directly consulting our mental states is called internalism. This view does not entail that all epistemic concepts are internal. John Greco gives an example to demonstrate the difference: “[S]uppose that someone learns the history of his country from unreliable testimony. Although the person has every reason to believe the books that he reads and the people that teach him, his understanding of history is in fact the result of systematic lies and other sorts of deception” (2005: 259). Objectively speaking, this person’s beliefs are not reliably connected with reality. Subjectively, though, he is following his evidence to its rational conclusion. Should we say this person’s beliefs are justified? Since the reliability of his sources is beyond his ability to evaluate, the internalist says he has fulfilled his epistemic duty: yes, he is justified.

For centuries, there was no serious alternative to internalism. As we will see in §6, the advent of Gettier cases in the 20th century constituted a serious challenge to internalism, and it contributed to alternative, externalist accounts of knowledge and justification. This move to externalism also led to closer scrutiny of internalism, and new concerns about its adequacy arose. I review three of these here. But before doing so, it is helpful to distinguish two types of internalism: accessibilism and mentalism.

a. Accessibilism and Mentalism

According to accessibilists, in order for a belief to be justified for a person, that person must have “reflective access” to good reasons for holding that belief. To have reflective access is to be directly mentally aware of reasons for holding a belief. Some accessibilists argue that a person’s access must be occurrent, that is, she must be currently aware of her reasons for holding a belief (Conee and Feldman 2004). Others hold the looser requirement that, as long as a person has had direct access to a relevant justifying reason, she is justified in holding the supported belief.

According to mentalists, reflective access may be sufficient for justification, but it is not necessary. All that is necessary for a belief to be justified is that a person has mental states that justify the belief, regardless of whether a person has reflective access to those states. Mentalists allow that some non-reflectively accessible mental states can justify beliefs.

Mentalism is supposed to have several advantages over accessibilism given the standard criticisms of internalism. For example, some have objected to internalism on the grounds that it cannot accommodate intuitive cases of stored or forgotten evidence. If, for example, you are driving and not thinking about whether Washington, D.C. is the capital of the United States, or you have forgotten any evidence for this belief, are you justified in believing that it is? If not, could we say that you know it is the capital? Accessibilists claim that a person must be able to access her evidence for a belief while she is currently thinking about it and presumably without prompting. Few of us, though, hold (or even could hold) a belief with all its attendant reasons in mind at once. Similarly, it seems reasonable to imagine that a person is justified in believing a proposition for which she has forgotten her evidence. Mentalists can handle these cases by claiming that the ability to access stored facts can constitute dispositional justification, and that, even in cases of forgotten evidence, the mental states that justify the belief may still be consciously available, either occurrently or dispositionally (Conee and Feldman 2004).

The worry for mentalism is that, in allowing non-occurrent mental states to count as reasons, mentalism betrays its claim to be internalist. For example, there may be a lot of evidence I could have that P is true if I were in the right place at the right time. But the existence of that evidence does not obviously justify P for me since being in such a place might be a matter of luck. Being at the right place at the right time may mean that the evidence that, say, “Washington is the capital,” is in a book nearby that I never happen to read or that the evidence is one of my mental states that I am not currently thinking about, even if I could when prompted. Specifying just what it means for evidence to be available but not occurrent turns out to be quite difficult. Richard Feldman (1988) argues that in neither of these examples am I justified in believing that Washington is the capital and that a mental state counts as evidence if and only if one is currently thinking of P. Feldman embraces the counterintuitive implication that “one does not know things such as that Washington is the capital when one is not thinking of them” (237). Despite these difficulties, the distinction between accessibilism and mentalism plays an important role in the debate over internalism.

For more on accessibilism and mentalism, see §1.c of Internalism and Externalism in Epistemology.

b. Objections to Internalism

In addition to the Gettier problem (§6), there are many other lines of argument that challenge internalism. Here, I review only three. One of these lines is called the access problem. Traditional foundationalists have accepted some version of accessibilism. For example, Roderick Chisholm writes that justification is “internal and immediate in that one can find out directly, by reflection, what one is justified in believing at any time” (1989: 7). But what if the belief P that justifies my current belief Q is tucked far back in the recesses of my memory and would require more time than I currently have to access it? Am I still justified in believing Q? Or worse, imagine that I have forgotten P; there is no possibility that I can directly access it. However, Q seems true to me, I remember that I had good reasons for believing it, and I do not have any reasons to doubt Q now. Am I justified in believing Q in this case?

Without some modification, the internalist must say no in both cases—the relevant evidence is neither immediately nor reflectively available—though intuitively these are normal cases of justified belief. The standard response is two-fold. First, we must admit that justification comes in degrees: having more evidence can increase one’s justification, and some evidence is stronger than other evidence. And second, the states of seeming to be justified or of remembering that I was justified can themselves constitute reasons for belief. Therefore, in these cases, the internalist might respond that, while the justifications are not as strong as we would prefer, they are, nonetheless, based on accessible mental states.

A second, related objection to internalism is what, following John Greco, I will call the etiology problem. Internalism tends to make justification so easy that it is unclear how one is able to distinguish between good and bad reasons. Consider an example from Greco (2005):

Charlie is a wishful thinker and believes that he is about to arrive at his destination on time. He has good reasons for believing this, including his memory of train schedules, maps, the correct time at departure and at various stops, etc. However, none of these things is behind his belief—he does not believe what he does because he has these reasons. Rather, it is his wishful thinking that causes his belief. Accordingly, he would believe that he is about to arrive on time even if he were not. (261)

Why is the combination of his beliefs about schedules, maps, and time a better reason for thinking he is about to arrive than wishful thinking? Presumably, it is because those things are reliable indicators of truth, whereas wishful thinking is not. Being a reliable indicator of truth, though, is an external relationship between the belief and the world—something to which Charlie has no access. We can arrive at a similar result by imagining that Charlie does base his belief on his beliefs about train schedules, and so forth, but stipulating that he formed those beliefs carelessly and haphazardly, and only accidentally arrived at the correct conclusion. Nevertheless, based on these beliefs, it seems clear to Charlie that the conclusion follows.

An internalist might respond that this objection depends on the mistaken assumption that internal factors exclude empirical evidence. To see how this assumption slips in, consider how an externalist might determine that train schedules are more fitting sources of evidence than wishful thinking. Presumably, externalists would evaluate the past track record of each source of evidence to see which more reliably indicates truth. The act of “reviewing their past track records,” however, involves appealing to internal states about what seems to be their track records and, therefore, is not obviously different from what an internalist would do; one has internal access to evidence that train arrivals correspond more reliably with train schedules than with wishes. By demanding that justification depend only on external features of the belief-forming process, and then appealing to internal features to evaluate external reliability, Greco is not denying that one must have good, accessible reasons for her beliefs; he is simply disguising the internal features by including them in the external conditions (Feldman 2005: 281). Therefore, either objective etiology is essential to justification, in which case, since no one has access to it, we are left with skepticism; or subjective access to evidence of reliable etiology is sufficient for justification, and the externalist criticism misses its mark.

Both the access problem and the etiology problem challenge the idea that we can determine whether we are justified by appeal to internal states. But even if this challenge can be answered, internalism is sometimes thought to imply that we can voluntarily control or change what we believe, that is, that we are guided but not determined by our evidence. The view that we have voluntary control over what we believe is called doxastic voluntarism (from the Greek doxa, “belief” or “opinion”). The idea is that internalism is intuitive partially because it allows us to take responsibility for our epistemic behavior. In fact, “[n]onvoluntarism is generally taken to rule out responsibility, since one is not responsible for what one does not control” (Adler 2002: 64). Taking responsibility implies we can decide to respond to evidence well or poorly. This suggests a third objection to internalism called the guidance problem. (For presentations of the guidance problem, see John Heil 1983 and William Alston 1989.)

It turns out that it is difficult to control what we believe: try to make yourself believe you are not reading this page or that you are not real. It is unclear what it would take to convince you that such things are true. That kind of shift would seem to require a complete change in your evidence. But if that is right, then our beliefs are tied strongly to factors outside our control; we cannot simply decide what evidence we have or whether to believe on the basis of that evidence. According to this critique, the idea that internalism explains how we take responsibility for our beliefs is misguided.

In response, contemporary internalists tend to accept that our beliefs are largely determined by the evidence we perceive ourselves to have, but they reject the idea that complete or even partial voluntary control is necessary for responsibility. Carl Ginet (2001) argues that our control over our beliefs is limited but that we nonetheless may decide what to believe in those cases where the evidence is indecisive, cases “where the subject has it open to her also to not come to believe it” (74). Further, Earl Conee and Richard Feldman (2004) argue that a person’s beliefs may appropriately fit one’s evidence even if she cannot control whether she forms those beliefs. For instance:

Suppose that a person spontaneously and involuntarily believes that the lights are on in the room, as a result of the familiar sort of completely convincing perceptual evidence. This belief is clearly justified, whether or not the person can voluntarily acquire, lose, or modify the cognitive process that led to the belief. (85)

For a more comprehensive treatment of the debate between internalists and externalists, see Internalism and Externalism in Epistemology.

6. The Gettier Era

The idea that justification is the crucial link between true belief and knowledge seems to be implicit in epistemology since Plato. In Theaetetus, Socrates gives an example of a jury that has been persuaded by hearsay of a true judgment that can only be known by an eye-witness (201b-c). This example shows that “true judgment” is not the same thing as “knowledge,” and, therefore, that some other element is needed. Theaetetus suggests that knowledge is true judgment plus a logos—an account or argument. Socrates considers three ways of giving an account of a true judgment but concludes that none is plausible. Nevertheless, from then until now, philosophers have generally thought something like the Theaetetus’s suggestion must be right, and most of those accounts have been internalist. Socrates’s own suggestion, in Plato’s Meno, is that knowledge is a type of remembrance of what is true based on direct experience prior to being born. Descartes tries to close the gap between true belief and knowledge with the apprehension of clarity and distinctness. Kant attempts to bring them together with the transcendental apperception of the conditions for the possibility of veridical perception. In each case, the knower is assumed to have direct access to something that explains when true belief is knowledge.

Unfortunately, a thought experiment developed in the 20th century challenges the idea that any internal criteria can distinguish knowledge from accidentally true belief. This thought experiment was named the Gettier Problem after Edmund Gettier, who introduced the most influential examples in a famously brief 1963 paper. Examples from other philosophers proliferated after Gettier’s publication, but each new instance is standardly called a “Gettier Case.”

a. The History of the Gettier Problem

The idea is that there are cases where all three conditions on knowledge are met—a belief is justified and true—and yet that belief fails to be knowledge. Although some traditional internalists have allowed that a false belief can be justified, they have resisted the idea that a belief’s justification does not contribute to the likelihood of knowing. But if Gettier cases are successful, it is possible to be justified (in the classic internalist sense) in holding a true belief without that belief’s being knowledge.

The broken clock example in §1 is an early version of this problem, constructed by Bertrand Russell (1948). Here is another example Russell includes alongside his clock case:

There is the man who believes, truly, that the last name of the Prime Minister in 1906 began with a B, but believes this because he believes that Balfour was Prime Minister then, whereas in fact it was Campbell-Bannerman. … Such instances can be multiplied indefinitely, and show that you cannot claim to have known merely because you turned out to be right. (171)

The problem, though, contra Russell, is not merely that such a person turns out to be right; it is that the person’s belief is justified in cases where a belief turns out to be true by luck; justified true belief in these cases does not increase the likelihood that the belief is knowledge. The evidence that justifies the belief is not connected with the truth of the belief in the right way, and, recall from the introduction, believing in the right way is precisely the sort of thing justification is supposed to indicate.

Such cases trace at least as far back as Alexius Meinong (1906), but the most famous are Gettier’s. His cases are interesting because they show that such cases can occur even when our evidence includes logical entailment. In his first example, Gettier asks us to imagine that two men, Smith and Jones, have applied for the same job. Imagine also that Smith has very good reasons for believing: “Jones will get the job” and “Jones has 10 coins in his pocket.” From this, it follows logically that: “The man who will get the job has 10 coins in his pocket,” and Smith forms the belief that this is true. As it turns out, however, Smith has 10 coins in his pocket (though he does not know it) and he will get the job. So, Smith’s belief that the man who will get the job has 10 coins in his pocket is true, and he has good reasons for why this is so, but his reasons are unconnected with the real reasons it is true. Most philosophers have concluded that, since Smith’s true belief is just a matter of luck (and not a function of his reasons’ connection with the state of affairs that make it true), Smith does not know that the man who will get the job has 10 coins in his pocket.

Because of the many possible variations on cases like these, the idea that justification is based on evidence to which we have direct access faces a serious challenge. There is no clear sense in which that sort of evidence always or even regularly increases the likelihood that a belief is knowledge.

b. Responses to the Gettier Problem

Some philosophers have tried to save strong internalist justification from Gettier cases. For example, D. M. Armstrong—although he ultimately defends an externalist theory of justification—argues that Gettier cases can be avoided by adding a requirement that all evidence for a belief must be, not merely justified, but also knowledge. In the Gettier case above, since it is false that Jones will get the job, Smith’s belief that Jones will get the job cannot be knowledge, and its role in Smith’s evidence therefore undermines Smith’s ability to know that the man who will get the job has 10 coins in his pocket. (See Feldman 1974 for a counterexample.)

Others weaken the requirements on justification by arguing that, while knowledge may have constraints outside our conscious access, justification is more plausibly about responsible or apt belief than truth. Call this weak internalist justification (see Zagzebski, 1996).

Still others argue that Gettier cases suggest either that justification is simply not an internal matter or that knowledge does not require justification. Those who argue that justification is external claim that whether a belief is justified depends on whether there is a law-like connection (conceptual or physical) between a belief and the state of affairs it is about (Bergmann 2006). This approach is externalist because it explains justification in terms of belief-forming processes outside the mental life of the believer. In adopting externalism, some treat internal mental states as irrelevant for justification, while others argue that internal states can play an indirect and partial role in justification. Ernest Sosa (1991), for example, argues that internal states can contribute to the state of affairs that grounds the reliability of certain belief-forming behaviors.

7. Externalist Foundationalism

Gettier cases, in addition to other challenges to internalism, have led some epistemologists to reject the idea that justification requires an internal condition. In its most minimal form, externalism is the view that internalism is false, that is, that some features external to the mental life of a person play a necessary role in justification (Greco 2005: 258). However, many versions of externalism also explicitly reject internal conditions for justification, at least for non-inferential knowledge. Some philosophers have developed externalist accounts of knowledge that lack any account of justification (compare Goldman 1967, though he has since given up this view). The debate between externalists and internalists, though, is primarily about justification. Externalist accounts of justification differ from internalist accounts by challenging the idea that justification is primarily or ultimately about good reasons when good reasons are construed as mental states.

To accommodate the external features that connect beliefs with states of the world, externalists modify what was traditionally meant by justification; rather than appealing to a person’s subjective perspective on her evidence, externalists appeal to the objective features of the belief-forming and -holding behavior. Epistemic standing is not about the reasons a person has; it is about the relationship between a belief and the world, about how that belief is formed or maintained, where that relationship is not a guarantee of truth but a strong indicator of it, typically because of a causal, lawful, conceptual, or counterfactual connection with the states of affairs the belief is about. The most prominent version of externalism is the view that a belief is justified just in case it is caused by a reliable process, where “reliable” means that the process produces more true beliefs than false ones.
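
The truth-ratio idea can be put slightly more explicitly with a rough schematic gloss (this is my shorthand, not a formulation any particular reliabilist endorses): where a belief-forming process produces beliefs b1, …, bn over its history of use, its reliability is the proportion of those beliefs that are true, and the simple proposal is that a belief is justified just in case it is produced by a process whose truth ratio exceeds one half (presumably well above that bare minimum). How to fix that threshold, and how to type the process whose track record matters, are among the issues raised by the objections discussed in §7.c below.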

a. Externalism, Foundationalism, and the DIJ

Externalists agree that, to resolve the DIJ, one needs to avoid infinite regress and skepticism. So, rather than grounding justification in other beliefs (as coherentists do) or in non-belief states (as classical foundationalists do), externalists ground justification and knowledge in the objective way the world contributes to belief formation or maintenance.

Some externalists, like Armstrong (1973) and Goldman (1979), make room for something like basic beliefs, from which something like non-basic beliefs are inferred. This means that contemporary externalists tend to accept the foundationalist structure—some beliefs are produced reliably by non-belief states, and some beliefs can be produced by other beliefs—though they reject the classical internalist construal of the distinction between basic and non-basic beliefs. Belief-forming processes are treated as external to the knower’s mental states, and whether a belief is justified (and, therefore, knowledge) depends on the reliability of those processes.

Unlike classical foundationalists, who appeal to internal seemings, indubitability, or self-evidence to justify such beliefs, externalists like Goldman argue that these non-inferentially produced beliefs count as knowledge simply because they stand in a reliable relationship with the world. A non-inferential belief is knowledge when and because it is lawfully (Armstrong) or reliably (Goldman) produced.

b. Reliabilism

The concept of reliability is crucial to externalist theories of justification (in contrast to externalist theories of knowledge, for example, Goldman 1967, 1976 and Armstrong 1973). There are two types of reliabilist theories of justification. According to reliable indicator theories, a belief is justified just in case its reason or ground is a reliable indicator of the belief’s truth (Swain 1981 and Alston 1988). According to process reliabilism, a belief is justified just in case it was causally produced by reliable processes (Goldman 1979 and Bach 1985). Although Armstrong focuses primarily on an externalist theory of knowledge, his “thermometer theory of knowledge” holds that certain mental states serve as reliable indicators or signs that what is believed is true, and therefore make the belief reasonable, or “justifiable.” Comparing non-inferential belief to a thermometer reading, Armstrong writes:

In some cases, the thermometer-reading will fail to correspond to the temperature of the environment. Such a reading may be compared to non-inferential false belief. In other cases, the reading will correspond to the actual temperature. Such a reading is like non-inferential true belief. (166)

There are a number of important qualifications to Armstrong’s view, but the central point is that a belief is justified independently of whether the person has reasons to believe it: “The subject’s belief is not based on reasons, but it might be said to be reasonable (justifiable), because it is a sign, a completely reliable sign, that the situation believed to exist does in fact exist” (183).

The benefit of Armstrong’s law-like account is that it suggests a counterfactual account of causal relations along the following lines: as long as a person has a means of distinguishing a proposition, P, from a mutually exclusive but very similar proposition, Q, the person is justified in believing P. For example, if Judy and Trudy are twins, and when Sam sees someone who looks like Judy he would not mistake Trudy for Judy, then Sam is justified in believing that he sees Judy. “But if Sam frequently mistakes Judy for Trudy, and Trudy for Judy, he presumably does not have any way of distinguishing between them” (Goldman 1976: 778).

Unfortunately, reliable indicator theories tend to be overly strict in their analysis of cases. Goldman asks us to consider Oscar, who is standing in an open field and sees a Dachshund, from which he forms the belief that he sees a dog. As it happens, Oscar often mistakes certain dog breeds for the wolves that frequent the field. If he were to see a wolf, he might easily mistake it for a dog. Now, is his seeing a Dachshund a reliable indicator of seeing a dog? Since Oscar would likely believe he is seeing a dog regardless of whether he is seeing a wolf or a Dachshund, reliable indicator theories (at least Armstrong’s) would say his seeing a Dachshund is not a reliable indicator of seeing a dog. Intuitively, though, Oscar is justified in believing he sees a dog, so the verdict seems too strict. Whether or not this criticism is ultimately successful, and whether or not it applies to all reliable indicator theories, reliable process theories quickly overshadowed interest in this type of reliabilism.

Process reliabilism is the view that a belief is justified just in case it is produced by a reliable cognitive process, where a cognitive process may include either conscious reasoning processes or unconscious mechanisms. As formulated earlier in this article, reliabilism states a necessary and sufficient condition for justification (“just in case”), but some reliabilists formulate weaker versions. Goldman treats reliable production as a sufficient condition (though he argues against the plausibility of alternative sufficient conditions): “If S’s believing p at t results from a reliable cognitive belief-forming process (or set of processes), then S’s belief in p at t is justified” (1979: 13). Kent Bach treats it as only a necessary condition: “The idea, roughly, is that to be justified a belief must be formed as the result of reliable processes…” (1985: 199). Despite these differences, externalists uniformly reject internalist conditions as sufficient for justification. This commitment, however, leaves them open to a number of interesting criticisms.

c. Objections to Externalism

Externalism putatively has the advantage of avoiding the Gettier problem (though this is controversial) and several other skeptical concerns, and of capturing some important intuitions about knowledge. Nevertheless, it faces several serious criticisms. On the basis of these criticisms, some internalists claim that externalists have simply changed the subject altogether and are not really talking about justification.

One famous criticism of externalism is called the generality problem. Earl Conee and Richard Feldman (1998) present an example to demonstrate the problem:

Suppose that Smith has good vision and is familiar with the visible differences among common species of trees. Smith looks out a house window one sunny afternoon and sees a plainly visible nearby maple tree. She forms the belief that there is a maple tree near the house. Assuming everything else in the example is normal, this belief is justified and Smith knows that there is a maple tree near the house. Process reliabilist theories reach the right verdict about this case only if it is true that the process that caused Smith’s belief is reliable. (372)

Is it reliable? That depends on which process formed the belief. Was it the unique causal set of events leading to that particular belief? If so, it is not reliable, since token, or one-time, events have no historical track record. Reliabilists respond to this challenge by saying it is the type of process that must be reliable in order for a belief to be justified, not the token. If that is right, then we face the problem of determining which type of process formed the belief. Was it the “visually initiated belief-forming process,” the “process of a retinal image of such-and-such specific characteristics leading to a belief that there is a maple tree nearby,” the “process of relying on a leaf shape to form a tree-classifying judgment,” the “perceptual process of classifying by species a tree located behind a solid obstruction,” or any number of others (373)? There are innumerable options, and even if a combination of types were involved, each type would have to meet reliability conditions. Conee and Feldman conclude, “Without a specification of the relevant type, process reliabilism is radically incomplete” (373).

A second objection to externalism is called the New Evil Demon Problem (NEDP) (Cohen and Lehrer 1983). In Descartes’s original evil demon scenario, introduced to motivate the problem of skepticism, we are asked to consider the possibility that all our current perceptions are the fictitious construction of a being intent on deceiving us, such that all our perceptual and intuitive beliefs are false. The NEDP puts the thought experiment to a very different purpose: if the evil demon world is possible, we can imagine two worlds: (1) a non-deceptive world, where our perceptions are reliably produced by the world outside of our minds, and (2) an evil demon world, where there are people just like you and me, who have exactly the same mental states that we do but whose perceptions are systematically unreliable—they track nothing of the truth at that world. There are no trees, buildings, bodies, and so forth. Whatever actually exists at that world, those people have no perception of it. According to externalists—process reliabilists, in particular—the beliefs of people in the real world are justified and those of people in the demon world are unjustified, despite the fact that their mental lives are identical. Yet it is difficult to accept that demon-world beliefs about looking both ways before crossing the street or getting a second opinion about a medical diagnosis are unjustified. People who believe such things are acting responsibly, from their own perspective, on their evidence. This suggests that reliabilism is not really about justification at all.

A third objection to externalism is what Ernest Sosa (2001) calls the metaincoherence problem, which attempts to show that a person’s belief can be externally reliable while internally unjustified. In the literature, there are two versions of the metaincoherence problem. The first is what I call first-order metaincoherence, which attempts to show that externalism is insufficient for justification. The second is what I call second-order metaincoherence, which challenges the externalist’s reasons for holding externalism.

One famous example of first-order metaincoherence is a thought experiment given in various forms by Laurence BonJour (1985) and Keith Lehrer (1990). Consider Armstrong’s Thermometer Analogy from above. Imagine there was a human thermometer, that is, someone who “undergoes brain surgery by an experimental surgeon who invents a small device which is both a very accurate thermometer and a computational device capable of generating thoughts” (Lehrer 1990: 163). This person, whom Lehrer names Mr. Truetemp, is unaware of the device despite the fact that it regularly causes him to form reliable beliefs that he unreflectively accepts about the temperature. On a given day, he might reliably form and accept the belief that it is 104 degrees Fahrenheit outside. Is this belief knowledge? Lehrer concludes: “Surely not. He has no idea whether he or his thoughts about the temperature are reliable” (164). BonJour concludes similarly, “Part of one’s epistemic duty is to reflect critically upon one’s beliefs, and such critical reflection precludes believing things to which one has, to one’s knowledge, no reliable means of epistemic access” (1985: 42).

The second-order metaincoherence problem is stated by Barry Stroud (1989):

The scientific ‘externalist’ claims to have good reason to believe that his theory is true. It must be granted that if, in arriving at his theory, he did fulfill the conditions his theory says are sufficient for knowing things about the world, then if that theory is correct, he does in fact know that it is. But still, I want to say, he himself has no reason to think that he does have good reason to think that his theory is correct. (321)

The worry is this: externalists claim that features of the world outside the mental life of a believer ultimately determine whether a belief is justified. So, if externalism is true, externalists have no reason, from within their own perspective, to believe that it is true; in fact, they are committed to holding that whether their belief in externalism is justified is something they cannot determine from within their own perspective. Again, the belief may be externally reliable, but it is internally unjustified.

If these criticisms hit their mark, epistemologists must make some difficult decisions about which approach—internalism or externalism—has the fewest or least pernicious problems. In the 21st century, much work is underway to address these problems. For those who remain unconvinced by either approach, there are recent developments that attempt to salvage some of the insights of both internalism and externalism. A prominent example involves introducing character traits into the conditions for justification. We turn next to this view, called virtue epistemology.

8. Justification as Virtue

Classical theories of justification that imply a normative or belief-guiding dimension are modeled largely on normative ethical theories, whether teleological (outcome-based) accounts or deontological (duty-based) accounts. They ask whether people are rationally obligated to, permitted to, or obligated not to hold particular beliefs given their evidence. These are decision-based theories of rational normativity, as opposed to character-based theories. Just as virtue theory offers a non-decision-based alternative in ethics, it also suggests a non-decision-based alternative in epistemology. The attitudes and circumstances under which people form, maintain, and discard beliefs can be described as virtuous or vicious, and just as decision-based theories in epistemology are concerned with rational obligation (as opposed to moral obligation), character-based theories in epistemology are concerned either with intellectual character (as opposed to moral character) or with cognitive faculties understood as traits of a person (such as reason, perception, introspection, and memory). Of course, in matters of normativity, it is not a simple task to distinguish moral dimensions from rational or intellectual ones, but space prevents us from exploring that relationship here.

Virtue theories of justification hold that part of what justifies a belief is the intellectual traits with which a believer forms or holds the belief. Just as a person’s moral virtues contribute to the goodness of an action (kindness, compassion, honesty), a person’s intellectual virtues contribute to the epistemic goodness of a belief. Virtue theorists, however, are sharply divided as to which intellectual virtues are relevant. One prominent view is that justification is a function of those virtues that enhance reliability, that is, they have a strong external component (Sosa 1980; 2007). This view is known as virtue reliabilism.

A second prominent view is that justification is a function of those intellectual virtues that contribute to more general epistemic goods, including intellectual well-being, social trust, and the righting of epistemic injustice. These virtue responsibilists regard the truth-goal in epistemology very differently than both traditional epistemologists and their virtue reliabilist counterparts (Code 1984; Montmarquet 1993; Zagzebski 2000).

a. Virtue Reliabilism

A prominent version of virtue reliabilism is offered by Ernest Sosa (1980) in an attempt to resolve the tension between foundationalists and coherentists. Sosa argues that if beliefs are grounded in truth-conducive intellectual virtues (where truth-conducive is conceived in process reliabilist terms), then foundationalists have empirically stable abilities or acquired habits that help explain the connection between sensory experience and non-inferential belief. Further, reliable virtues help explain how justification emerges from a coherent set of beliefs—coherence is a type of intellectual virtue.

What do these intellectual virtues look like for Sosa? Borrowing an example from his (2007), consider an archer who is aiming at a target. In order to be successful, the archer must have a degree of competence, which Sosa calls “adroitness,” and the shot must be accurate. These features are analogous to the epistemic state of having a true belief (accuracy) that is formed on the basis of good evidence (adroitness). These two features alone, though, are insufficient for the person to believe in the right way. The person must also exercise his adroitness in circumstances that increase his likelihood of having accurate beliefs, that is, his shot must be accurate because it is adroit. Sosa calls this third feature “aptness,” “its being true because competent” (2007: 23). Some of these circumstances will be outside the believer’s control—wind gusts in the archer’s case; causal ties to the world in the epistemic case. But some—for example, the virtues—are within the believer’s control.

Sosa explains:

Aptness depends on just how the adroitness bears on the accuracy. The wind may help some, for example…. If the shot is difficult, however, from a great distance, the shot might still be accurate sufficiently through adroitness to count as apt, though with some help from the wind. (2007: 79)

Notice that the role of the wind is analogous to certain external features of a person’s belief-forming state. Nevertheless, intellectual virtues like those mentioned above can increase one’s adroitness and thereby increase the likelihood of accuracy.

Imagine a person who has good evidence that P but who either does not appeal to that evidence when forming the belief that P, appealing instead to, say, wishful thinking, or who appeals to that evidence carelessly, refusing to consider alternatives or just how strong the evidence is. Despite this person’s having good evidence, her belief is not apt because the belief’s truth was not due to the person’s competence with the evidence.

Because of this external dimension, this branch of virtue epistemology is regarded as a form of reliabilism. Unlike in externalist foundationalism, however, the reliability condition is not restricted to belief-forming processes; it is also highly dependent on context. Sosa says:

An archer might manifest sublime skill in a shot that does hit the bull’s-eye. This shot is then both accurate and adroit. But it could still fail to be accurate because adroit. The arrow might be diverted by some wind, for example, so that, if conditions remained normal thereafter, it would miss the target altogether. However, shifting winds might then ease it back on track towards the bull’s-eye. (79)

In epistemic cases, the believer must be suitably virtuous such that, under normal conditions, her beliefs are accurate because they are adroit.

b. Virtue Responsibilism

Sosa’s account has been well received, though there is disagreement as to whether it is sufficient for solving the problems at issue. One prominent criticism is that Sosa does not take his use of virtues far enough. Rather than having the virtues serve a more basic truth-goal, some argue, they should be conceived as central to the epistemic project.

Lorraine Code (1984) coined the term virtue responsibilism in contrast to Sosa’s reliabilism. It is the view that justification, or rather being an intellectually responsible agent, is a matter of acting virtuously in the practice of inquiry. Code argues that epistemic responsibility is the central intellectual virtue. Similarly, James Montmarquet argues that “S is subjectively justified in believing p insofar as S is epistemically virtuous in believing p” (1993: 99). This means that virtue responsibilism is internalist through and through.

Not all virtue responsibilists, however, eschew the truth-goal. As Linda Zagzebski explains, “It would not do any good for a person to be attentive, thorough, and careful unless she was generally on the right track” (2009: 82). But unlike externalist foundationalism, “the right track,” according to virtue epistemologists, does not necessarily include producing more true beliefs than false. There is more than one virtuous outcome, for example, in cases of creativity or inventiveness. It may be that “only 5 per cent of a creative thinker’s original ideas turn out to be true,” Zagzebski explains. “Clearly, their truth conduciveness in the sense of producing a high proportion of true beliefs is much lower than that of the ordinary virtues of careful and sober inquiry, but they are truth conducive in that they are necessary for the advancement of knowledge” (2000: 465). This suggests that the conditions under which a subject is justified are highly contingent on changing context and the goal of our epistemic behaviors. And virtue epistemologists argue that this captures the typical contingency of our epistemic lives.

c. Objections to Virtue Epistemology

In addition to internal disputes between virtue reliabilists and responsibilists, there are more serious concerns with the adequacy of virtue epistemology. Virtue reliabilism faces many of the same criticisms that face traditional reliabilism, including the generality problem, the New Evil Demon Problem, and the metaincoherence problems. Further, although there is an intuitive sense in which a reliably functioning method of forming beliefs is virtuous (in the Aristotelian sense of “excellence”), it is not clear how virtue reliabilism is substantively different from classical reliabilism. To be sure, virtue reliabilists take special pains to explain the roles of context, luck, and the knower’s aptness in forming beliefs, but these resources do not seem unavailable to traditional reliabilists.

Similarly, virtue responsibilism faces many of the same problems as virtue ethics. There are questions about which intellectual states count as epistemic virtues (different responsibilists have different lists), about whether some virtues should be privileged over others (for example, James Montmarquet (1992) argues that epistemic conscientiousness is the preeminent intellectual virtue), and about the ontological status of virtues (whether they are real dispositions or simply heuristics for categorizing types of behavior). There are also serious concerns about some extreme versions of responsibilism that completely disconnect intellectual virtue from truth-seeking, as with Code’s account, rendering discussions of intellectual virtue the province of ethics rather than epistemology.

To alleviate some of these concerns, some virtue epistemologists defend a mixed theory, arguing that an adequate virtue epistemology requires both a reliability and a responsibility condition (Greco 2000).

A general concern for both types of virtue epistemology is that virtue theory associates justification too closely with the idea of credit or achievement, that is, with whether a person has formed beliefs well. Jennifer Lackey (2007, 2009), for example, argues that if knowledge is produced by the virtuous activity of others (like that of a reliable witness) or if knowledge is innate, then it is not obvious how a person’s belief-forming behavior can be virtuous or vicious, as there is no such behavior involved. In the case of the reliable witness, a hearer simply accepts a belief on the basis of the witness’s testimony. In the case of innate knowledge, the knower does nothing to increase the likelihood that her beliefs are reliable; they are reliable for reasons outside her epistemic behavior. If these criticisms are right, virtue epistemology may be unable to explain a range of important types of knowledge.

For a more detailed treatment of virtue epistemology, see Virtue Epistemology.

9. The Value of Justification

Each of the theories of justification reviewed in this article presumes something about the value of justification, that is, about why justification is good or desirable. Traditionally, as in the case of the Theaetetus noted above, justification is supposed to position us to understand reality, that is, to help us obtain true beliefs for the right reasons. Knowledge, we suppose, is valuable, and justification helps us attain it. However, skeptical arguments, the influence of external factors on our cognition, and the influence of various attitudes on the way we conduct our epistemic behavior suggest that attaining true beliefs for the right reasons is a forbidding goal, and it may not be one whose attainment we can access internally. Therefore, there is some disagreement as to whether justification should be understood as aimed at truth or at some other intellectual goal or set of goals.

a. The Truth Goal

All the theories we have considered presume that justification is a necessary condition for knowledge, though there is much disagreement about what precisely justification contributes to knowledge. Some argue that justification is fundamentally aimed at truth, that is, it increases the likelihood that a belief is true. Laurence BonJour writes, “If epistemic justification were not conducive to truth in this way…then epistemic justification would be irrelevant to our main cognitive goal and of dubious worth” (1985: 8). Others argue that there are a number of epistemic goals other than truth and that in some cases, truth need not be among the values of justification. Jonathan Kvanvig explains:

[I]t might be the case that truth is the primary good that defines the theoretical project of epistemology, yet it might also be the case that cognitive systems aim at a variety of values different from truth. Perhaps, for instance, they typically value well-being, or survival, or perhaps even reproductive success, with truth never really playing much of a role at all. (2005: 285)

Given this disagreement, we can distinguish between what I will call the monovalent view, which takes truth as the sole, or at least fundamental, aim of justification, and the polyvalent view (or, as Kvanvig calls it, the plurality view), which allows that there are a number of aims of justification, not all of which are even indirectly related to truth.

b. Alternatives to the Truth Goal

One motive for preferring the monovalent view is that, if truth (that is, connecting belief with reality in the right way) is not the primary goal of justification, then one is left only with goals that are not epistemic, that is, goals that cannot contribute to knowledge. The primary worry is that, in rejecting the truth goal, one is left with pragmatism. In response, those who defend polyvalence argue that, in practice, there are other cognitive goals that are (1) not merely pragmatic and (2) able to meet the conditions for successful cognition. Kvanvig explains that “not everyone wants knowledge…and not everyone is motivated by a concern for understanding. … We characterize curiosity as the desire to know, but small children lacking the concept of knowledge display curiosity nonetheless” (2005: 293). Further, much of our epistemic activity, especially in the sciences, is directed toward “making sense of the course of experience and having found an empirically adequate theory” (ibid., 294). Such goals can be achieved without appealing to truth at all. If this is right, justification aims at a wider array of cognitive states than knowledge.

Another argument for polyvalence allows that knowledge is the primary aim of justification but that much more is involved in justification than truth. The idea is that, even if one were aware of belief-forming strategies that are conducive to truth (following the evidence where it leads; avoiding fallacies), one might still not be able to use those strategies without having other cognitive aims, namely, intellectual virtues. Following John Dewey, Linda Zagzebski says that “it is not enough to be aware that a process is reliable; a person will not reliably use such a process without certain virtues” (2000: 463). As noted above, virtue responsibilists allow that the goal of having a large number of true beliefs can be superseded by the desire to create something original or inventive. Further still, following strategies that are truth-conducive under some circumstances can lead to pathological epistemic behavior. Amélie Rorty, for example, argues that belief-forming habits become pathological when they continue to be applied in circumstances no longer relevant to their goals (Zagzebski, ibid., 464). If this argument is right, then truth is, at best, an indirect aim of justification, and intellectual virtues like openness, courage, and responsibility may be more important to the epistemic project.

c. Objections to the Polyvalent View

One response to the polyvalent view is to concede that there are apparently many cognitive goals that fall within the purview of epistemology but to argue that all of these are related to truth in a non-trivial way. The goal of having true beliefs is a broad and largely indeterminate goal. According to Marian David, we might fulfill it by believing a truth, by knowing a truth, by having justified beliefs, or by having intellectually virtuous beliefs. All of these goals, argues David, are plausibly truth-oriented in the sense that they derive from, or depend on, a truth goal (David 2005: 303). David supports this claim by asking us to consider which of the following pairs is more plausible:

A1. If you want to have TBs [true beliefs] you ought to have JBs [justified beliefs].

A2. We want to have JBs because we want to have TBs.

B1. If you want to have JBs you ought to have TBs.

B2. We want to have TBs because we want to have JBs. (2005: 303)

David says, “[I]t is obvious that the A’s [sic] are way more plausible than the B’s. Indeed, initially one may even think that the B’s have nothing going for them at all, that they are just false” (ibid.). This intuition, he concludes, tells us that the truth-goal is more fundamental to the epistemic project than anything else, even if one or more other goals depend on it.

Almost all theories of epistemic justification allow that we are fallible, that is, that our justified beliefs, even if formed by reliable processes, may sometimes be false. Nevertheless, this does not detract from the claim that the aim of justification is true belief, so long as it is qualified as true belief held in the right way.

d. Rejections of the Truth Goal

In spite of these arguments, some philosophers explicitly reject the truth goal as essential to justification and cognitive success. Michael Williams (1991), for example, rejects the idea that truth even could be an epistemic goal when conceived of as “knowledge of the world.” Williams argues that in order for us to have knowledge of the world, there must be a unified set of propositions that constitute knowledge of the world. Yet, given competing uses of terms, vague domains of discourse, the failure of theoretical explanations, and the existence of domains of reality we have yet to encode into a discipline, there is not a single, unified reality to study. Williams argues that because of this, we do not necessarily have knowledge of the world:

All we know for sure is that we have various practices of assessment, perhaps sharing certain formal features. It doesn’t follow that they add up to a surveyable whole, to a genuine totality rather than a more or less loose aggregate. Accordingly, it does not follow that a failure to understand knowledge of the world with proper generality points automatically to an intellectual lack. (543)

In other words, our knowledge is not knowledge of the world—that is, access to a unified system of true beliefs, as the classical theory would have it. It is knowledge of concepts in theories putatively about the world, constructed using semantic systems that are evaluated in terms of other semantic systems. If this is, in fact, all there is to knowing, then truth, at least as classically conceived, is not a meaningful goal.

Another philosopher who rejects the truth goal is Stephen Stich (1988; 1990). Stich argues that, given the vast amount of disagreement among novices and experts about what counts as justification, and given the many failures of theories of justification to adequately ground our beliefs in anything other than calibration among groups of putative experts, it is simply unreasonable to believe that our beliefs track anything like truth. Instead, Stich defends pragmatism about justification, that is, the view that justification just is practically successful belief; thus, truth cannot play a meaningful role in the concept of justification.

A response to both views might be that, in each case, the truth goal has not been abandoned but simply redefined or relocated. Correspondence theories of truth take it that propositions are true just in case they express the world as it is. If the world is not expressible propositionally, as Williams seems to suggest, then this type of truth is implausible. Nevertheless, a proposition might be true in virtue of being an implication of a theory, and so, for example, we might adopt a more semantic than ontological theory of truth, and it is not clear whether Williams would reject this sort of truth as the aim of epistemology.

Similarly, someone might object to Stich’s treating pragmatism as if it were not truth-conducive in any relevant sense. If something is useful, it is true that it is useful, even in the correspondence sense. Even if evidence does not operate in a classical representational manner, the success of beliefs in accomplishing our goals is, nevertheless, a truth goal. (See Kornblith 2001 for an argument along these lines.)

10. Conclusion

Epistemic justification is an evaluative concept about the conditions for right or fitting belief. A plausible theory of epistemic justification must explain how beliefs are justified, the role justification plays in knowledge, and the value of justification. A primary motive behind theories of justification is to solve the dilemma of inferential justification. To do this, one might accept the inferential assumption and argue that justification emerges from a set of coherent beliefs (internalist coherentism) or an infinite set of beliefs (infinitism). Alternatively, one might reject the inferential assumption and argue that justification derives from basic beliefs (internalist foundationalism) or through reliable belief-forming processes (externalist reliabilism). If none of these views is ultimately plausible, one might pursue alternative accounts. For example, virtue epistemology introduces character traits to help avoid problems with these classical theories. Other alternatives include hybrid views, such as Conee and Feldman’s (2008), mentioned above, and Susan Haack’s (1993) foundherentism.

11. References and Further Reading

  • Aikin, S. 2009. “Don’t Fear the Regress: Cognitive Values and Epistemic Infinitism.” Think, 23, 55-61.
  • Aikin, S. F. 2011. Epistemology and the Regress Problem. London: Routledge.
  • Alston, W. P. 1988. “An Internalist Externalism,” Synthese, 74, 265-283.
  • Alston, W. P. 1989. Epistemic Justification. Ithaca: Cornell University Press.
  • Armstrong, D. M. 1973. Belief, Truth, and Knowledge. Cambridge: Cambridge University Press.
  • Bach, K. 1985. “A Rationale for Reliabilism.” The Monist, 68, 246-63. Reprinted in S. Bernecker and F. Dretske, eds. 2000. Knowledge: Readings in Contemporary Epistemology. Oxford: Oxford University Press, 199-213. Cited pages are to this anthology.
  • Bergmann, M. 2006. Justification Without Awareness. New York: Oxford.
  • Blanshard, B. 1939. The Nature of Thought. London: Allen & Unwin.
  • BonJour, L. 1980. “Externalist Theories of Empirical Knowledge.” Midwest Studies in Philosophy 5: Studies in Epistemology. Minneapolis: University of Minnesota Press, 53-73.
  • BonJour, L. 1985. The Structure of Empirical Knowledge. Cambridge: Harvard University Press.
  • BonJour, L. and E. Sosa. 2003. Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues. Malden: Wiley-Blackwell.
  • Chisholm, R. 1966. Theory of Knowledge. Englewood Cliffs: Prentice Hall.
  • Chisholm, R. 1982. “A Version of Foundationalism,” in The Foundations of Knowing, ed. R. Chisholm. Minneapolis: University of Minnesota Press.
  • Chisholm, R. 1989. Theory of Knowledge, 3rd ed. Englewood Cliffs: Prentice Hall.
  • Code, L. 1984. “Toward a ‘Responsibilist’ Epistemology.” Philosophy and Phenomenological Research, 45 (1), 29–50.
  • Cohen, S. and K. Lehrer. 1983. “Justification, Truth, and Knowledge.” Synthese 55 (2), 191-207.
  • Conee, E. and R. Feldman. 2004. Evidentialism. New York: Oxford University Press.
  • Conee, E. and R. Feldman. 1998. “The Generality Problem for Reliabilism,” in E. Sosa and J. Kim, eds. Epistemology: An Anthology. Malden: Blackwell Publishers, 372-386. Page numbers are to this anthology.
  • Dancy, J. 1985. Introduction to Contemporary Epistemology. Oxford: Basil Blackwell.
  • David, M. 2005. “Truth as the Primary Epistemic Goal: A Working Hypothesis,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 296-312.
  • Davidson, D. 2000. “A Coherence Theory of Truth and Knowledge,” in E. Sosa and J. Kim, eds. Epistemology: An Anthology. Malden: Blackwell Publishers, 154-63.
  • Elgin, C. 2005. “Non-foundationalist Epistemology: Holism, Coherence, and Tenability,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 156-67.
  • Feldman, R. 1974. “An Alleged Defect in Gettier Counter-examples.” The Australasian Journal of Philosophy. 52, 68-69.
  • Feldman, R. 1988. “Having Evidence,” in Philosophical Analysis, ed. D. F. Austin. Kluwer Academic Publishers, 83-104.
  • Feldman, R. 2005. “Justification Is Internal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 270-84.
  • Fitelson, B. 2003. “A Probabilistic Measure of Coherence.” Analysis, 63, 194–199.
  • Frankfurt, H. 1973/2008. Demons, Dreamers, and Madmen: The Defense of Reason in Descartes’s Meditations. Princeton: Princeton University Press.
  • Gettier, E. 1963. “Is Justified True Belief Knowledge?” Analysis. 23, 121-23.
  • Ginet, C. 2001. “Deciding to Believe,” in Knowledge, Truth and Duty, ed. Matthias Steup. Oxford: Oxford University Press, 63-76.
  • Ginet, C. 2005. “Infinitism Is not the Solution to the Regress Problem,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 140-149.
  • Goldman, A. 1967. “A Causal Theory of Knowing.” The Journal of Philosophy, 64, 357-72.
  • Goldman, A. 1976. “Discrimination and Perceptual Knowledge.” The Journal of Philosophy, 73, 771-91.
  • Goldman, A. 1979. “What Is Justified Belief?” in Knowledge and Justification, ed. George S. Pappas. Dordrecht, Holland: D. Reidel Publishing, 1-23.
  • Greco, J. 2005. “Justification Is Not Internal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 257-70.
  • Haack, S. 1993. Evidence and Inquiry. Malden: Blackwell Publishing.
  • Harman, G. 1986. Change in View. Cambridge: MIT Press.
  • Heil, J. 1983. “Doxastic agency.” Philosophical Studies, 43 (3), 355-364.
  • Hempel, C. 1935. “On the Logical Positivist’s Theory of Truth.” Analysis, 2 (4), 49-59.
  • Klein, P. 2005. “Infinitism Is the Solution to the Regress Problem,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 131-40.
  • Klein P. 2014. “No Final End in Sight,” in Current Controversies in Epistemology, ed. R. Neta. London: Routledge, 95-115.
  • Klein, P. and T. A. Warfield. 1994, “What Price Coherence?” Analysis, 54, 129–132.
  • Kornblith, H. 2001. Knowledge and Its Place in Nature. Oxford: Oxford University Press.
  • Kvanvig, J. L. and W. D. Riggs. 1992. “Can a Coherence Theory Appeal to Appearance States?” Philosophical Studies, 67, 197-217.
  • Kvanvig, J. 2005. “Truth Is not the Primary Epistemic Goal,” in Contemporary Debates in Epistemology, eds. Matthias Steup and Ernest Sosa. Malden: Blackwell Publishing, 285-96.
  • Lackey, J. 2007. “Why we don’t deserve credit for everything we know.” Synthese, 158, 345–361.
  • Lackey, J. 2009. “Knowledge and credit,” Philosophical Studies, 142, 27–42.
  • Lehrer, K. 1974. Knowledge. Oxford: Clarendon Press.
  • Lehrer, K. 1986. “The Coherence Theory of Knowledge.” Philosophical Topics, 14, 5-25.
  • Lehrer K. 1990. Theory of Knowledge. Boulder: Westview Press.
  • Lewis, C. I. 1946. An Analysis of Knowledge and Valuation. LaSalle: Open Court.
  • McGrew, T. 1995. The Foundations of Knowledge. Lanham: Rowman & Littlefield.
  • Meinong, A. 1906. “Über die Erfahrungsgrundlagen unseres Wissens” [“On the Experiential Foundations of Our Knowledge”], in Abhandlungen zur Didaktik und Philosophie der Naturwissenschaften, Band [Vol.] I, Heft [Issue] 6, Berlin: J. Springer. Reprinted in Meinong 1968–78, Vol. V: 367–481.
  • Montmarquet, J. 1987. “Epistemic Virtue.” Mind, 96, 482–497.
  • Neurath, O. 1983/1932. “Protocol Sentences,” in R. S. Cohen and M. Neurath, eds. Philosophical Papers 1913–1946. Dordrecht: Reidel.
  • Olsson, E. J. 2009. Against Coherence: Truth, Probability, and Justification. Oxford: Oxford University Press.
  • Plantinga, A. 1983. “Reason and Belief in God,” in A. Plantinga and N. Wolterstorff, eds. Faith and Rationality. Notre Dame: University of Notre Dame Press, 16-93.
  • Plantinga, A. 1993a. Warranted Christian Belief. New York: Oxford.
  • Plantinga, A. 1993b. Warrant: The Current Debate. New York: Oxford.
  • Pollock, J. 1986. Contemporary Theories of Knowledge. Lanham: Rowman & Littlefield Publishers.
  • Poston, T. 2014. Reason and Explanation: A Defense of Explanatory Coherentism. Hampshire: Palgrave Macmillan.
  • Quine, W. V. O. and J. S. Ullian. 1970. The Web of Belief. New York: Random House.
  • Russell, B. 1948. Human Knowledge: Its Scope and Limits. London: Routledge.
  • Smithies, D. 2014. “Can Foundationalism Solve the Regress Problem?” in Current Controversies in Epistemology, ed. R. Neta. London: Routledge, 73-94.
  • Sosa, E. 1980. “The Raft and the Pyramid: Coherence Versus Foundations in the Theory of Knowledge.” Midwest Studies in Philosophy, 5 (1), 3–26.
  • Sosa, E. 1991. “Reliabilism and Intellectual Virtue,” in Knowledge in Perspective: Selected Essays in Epistemology. New York: Cambridge University Press, 131-145.
  • Sosa, E. 2001. “Reliabilism and Intellectual Virtue,” in Epistemology: Internalism and Externalism, ed. Hilary Kornblith. Malden: Blackwell Publishers, 147-62.
  • Sosa, E. 2007. A Virtue Epistemology: Apt Belief and Reflective Knowledge, Volume 1. Oxford: Oxford University Press.
  • Stich, S. 1988. “Reflective Equilibrium, Analytic Epistemology, and the Problem of Cognitive Diversity.” Synthese, 74, 391-413.
  • Stich, S. 1990. The Fragmentation of Reason. Cambridge: The MIT Press.
  • Stroud, B. 1989. “Understanding Human Knowledge in General,” in Knowledge and Skepticism, Marjorie Clay and Keith Lehrer, eds. Boulder: Westview, 31-50. Reprinted in S. Bernecker and F. Dretske, eds. 2000. Knowledge: Readings in Contemporary Epistemology. Oxford: Oxford University Press, 307-323. Page numbers are to this anthology.
  • Swain, Marshall. 1981. Reasons and Knowledge. Ithaca: Cornell University Press.
  • Williams, M. 2000. “Epistemological Realism,” in Epistemology: An Anthology, eds. E. Sosa and J. Kim. Malden: Blackwell Publishers, 536-555.
  • Zagzebski, L. 1996. Virtues of the Mind: An Inquiry into the Nature of Virtue and the Ethical Foundation of Knowledge. Cambridge: Cambridge University Press.
  • Zagzebski, L. 2000. “Virtues of the Mind,” in Epistemology: An Anthology, eds. E. Sosa and J. Kim. Malden: Blackwell Publishers, 457-467.
  • Zagzebski, L. 2009. On Epistemology. Belmont: Wadsworth.

 

Author Information

Jamie Carlin Watson
Email: jamie.c.watson@gmail.com
Broward College
U. S. A.

Laozi (Lao-tzu, fl. 6th cn. B.C.E.)

Laozi is the name of a legendary Daoist philosopher, the alternate title of the early Chinese text better known in the West as the Daodejing, and the moniker of a deity in the pantheon of organized “religious Daoism” that arose during the later Han dynasty (25-220 C.E.). Laozi is the pinyin romanization for the Chinese characters which mean “Old Master.” Laozi is also known as Lao Dan (“Old Dan”) in early Chinese sources (see Romanization systems for Chinese terms). The Zhuangzi (late 4th century B.C.E.) is the first text to use Laozi as a personal name and to identify Laozi and Lao Dan. The earliest materials to mention Laozi are in the Zhuangzi’s Inner Chapters (chs. 1-7), in the narration of Lao Dan’s funeral in ch. 3. Two other passages provide support for the linkage of Laozi and Lao Dan (in ch. 14 and ch. 27). There are seventeen passages in which Laozi plays a role in the Zhuangzi. Three are in the Inner Chapters, eight occur in the Yellow Emperor sections of the text (chs. 11, 12, 13, 14), five are in chapters likely deriving from Zhuang Zhou’s disciples (chs. 21, 22, 23, 25, 27), and one is in the final concluding editorial chapter (ch. 33). In the Yellow Emperor sections in which Laozi is the main figure, four passages contain direct attacks, in the form of dialogues, on Confucius and the Confucian virtues of ren, yi, and li. The sentiments expressed by Laozi in these passages are reminiscent of remarks from the Daodejing and probably date from the period in which that collection was reaching some near-final form. Some of these themes include the advocacy of wu-wei, rejection of discursive reasoning and mind meddling, condemnation of making discriminations, and valorization of forgetting and fasting of the mind. The earliest ascriptions of authorship of the Daodejing to Laozi are in the Hanfeizi and the Huainanzi. Over time, Laozi became a principal figure in institutionalized forms of Daoism, and he was often associated with the many transformations and incarnations of the dao itself.

Table of Contents

  1. Laozi and Lao Dan in the Zhuangzi
  2. Laozi and the Daodejing
  3. The First Biography and the Establishment of Laozi as the Founder of Daoism
  4. The Ongoing Laozi Myth
  5. References and Further Reading

1. Laozi and Lao Dan in the Zhuangzi

The Zhuangzi gives the following, probably fictional, account of Confucius’s impression of Laozi:

“Master, you’ve seen Lao Dan—what estimation would you make of him?” Confucius said, “At last I may say that I have seen a dragon—a dragon that coils to show his body at its best, that sprawls out to display his patterns at their best, riding on the breath of the clouds, feeding on the yin and yang. My mouth fell open and I couldn’t close it; my tongue flew up and I couldn’t even stammer. How could I possibly make any estimation of Lao Dan!” Zhuangzi, Ch. 14

Laozi’s relationship to Confucius is a major part of the Zhuangzi’s picture of the philosopher. Of the seventeen passages mentioning Laozi, Confucius figures as a dialogical partner or subject in nine. While it is clear that Confucius is thought to have a long way to go to become a zhenren (the Zhuangzi’s way of speaking about the perfected person), Lao Dan seems to feel sorry for Confucius in his reply to Wuzhi “No-Toes” in Ch. 5, The Sign of Virtue Complete. Laozi recommends to Wuzhi that he try to release Confucius from the fetters of his tendency to make rules and human discriminations (for example, right/wrong; beautiful/ugly) and set him free to wander with the dao.

Lao Dan addresses Confucius by his personal name “Qiu” in three passages. Since such a liberty is one that only a person with seniority and authority would take, this style invites us to believe that Confucius was a student of Lao Dan’s and thereby acknowledged Laozi as an authority. In one of these passages in which Lao Dan uses Confucius’s personal name Qiu, he cautions Confucius against clever arguments and making plans and strategies with which to solve life’s problems, telling him that such rhetoricians are simply like nimble monkeys and rat-catching dogs that are set aside when unable to perform (Ch. 12, Heaven and Earth). And on another occasion, Qiu claims that he knows the “six classics” thoroughly and that he has tried to persuade 72 kings of their truth, but they have been unmoved. Lao Dan’s reply is, “Good!” He tells Confucius not to occupy himself with such worn-out ways, and to instead live the dao himself (Ch. 14, Turning of Heaven).

In Sima Qian’s later attempt to provide an actual biography of Laozi (see below), Laozi’s vocation as a librarian figures prominently. If the ultimate source of this tradition is the Zhuangzi, we should not forget that the context of this record is as a component in the theme that Laozi taught Confucius, who was confused and having no success with his own teachings. Accordingly, the point of the story that mentions Laozi’s occupation as a librarian or archivist (ch. 13) is that Confucius’s writings, offered to Laozi by Confucius himself, are simply not worthy to be put into a library. We cannot be sure, then, that there is any real memory of Laozi’s occupation being preserved for us, as the story may be an entire fiction meant to make a point about the inadequacy of Confucius’s teachings.

Finally, in Ch. 14, Turning of Heaven, Lao Dan makes a direct attack not only on the rules and regulations of Confucius, but also on the teachings of the Mohists and on the veneration of the ancient emperors and legendary sages of the past, displaying his preference for experiential oneness with dao over any teaching or tradition of philosophers or great minds of the past.

2. Laozi and the Daodejing

The ways in which expressions of Laozi in the seventeen passages in which he occurs in the Zhuangzi sound like sentiments in the Daodejing (hereafter, DDJ) collectively represent one basis for the traditional attribution of the text to Laozi. For example, at Laozi’s funeral in Ch. 3, Qin Shi valorizes Laozi by saying that he accomplished much without appearing to do so, which is a reference both to the Old Master’s rejection of the pursuit of fame and power and also praise for his conduct as wu-wei (effortless action) in oneness with dao. Qin Shi’s praise of Laozi is also consistent with Laozi’s teaching to Yangzi Ju in Ch. 7 not to seek fame and power. Such conduct and attitudes are encouraged strongly in DDJ 2, 7, 22, 24, 51, and 77. When Laozi tells Wuzhi to return to Confucius and set him free from the disease of problematizing life and tying himself in knots by helping him to empty himself of making discriminations (Zhuangzi ch. 5), this same teaching shows up in the DDJ in many places (for example, chs. 5 and 18). Likewise, when Laozi criticizes Confucius for trying to spread the classics (twelve in number in ch. 13 and six in ch. 14) instead of valuing the wordless teaching, the DDJ has a ready parallel in ch. 56. While Confucius is teaching his disciples to put forth effort and cultivate benevolence (ren) and appropriate conduct (yi), Laozi tells him that he should be teaching effortless action (wu-wei) (Zhuangzi chs. 13, 14, and 21). This teaching also shows up in the DDJ (chs. 2, 3, 20, 47, 48, 57, 63, and 64). Finally, if we take Zhuangzi Ch. 33 as an original part of the work, then Lao Dan (Laozi) actually quotes DDJ 28.

In addition to the ways in which Laozi’s teachings in the Zhuangzi sound like those of the DDJ, we should also note that both of the very early classical works known as the Hanfeizi and the Huainanzi contain passages that are direct quotes or unmistakable allusions to teachings in the DDJ and attribute them to Lao Dan or Laozi by name. Tae Hyun Kim has made a study of these passages in the Hanfeizi, and the recent English translation of the Huainanzi by John Major and others makes it easy to locate these citations (for example, see Huainanzi, 11.3). All of these connections culminate in Sima Qian’s biography of Laozi (see below), which not only says that Laozi was the author of the DDJ, but explains that it was a written text of Laozi’s teachings given when he departed China to go to the West. So, by the 1st century B.C.E., it was accepted by tradition and lore that Laozi was the author of the DDJ.

However, the attribution of authorship of the DDJ to Laozi is much more complicated than it first appears. The DDJ has 81 chapters and about 5,000 Chinese characters, depending on which text is used. Its two major divisions are the dao jing (chs. 1-37) and the de jing (chs. 38-81). But actually, this division probably rests on nothing other than the fact that the principal concept opening Chapter 1 is dao (way) and that of Chapter 38 is de (virtue). Moreover, although the text has been studied by commentators in Chinese history for centuries, the general reverence shown to it, and the long-standing tradition that it was the work of the great philosopher Laozi, were two factors militating against any critical literary analysis of its structure. In spite of the view that the text had a single author named Laozi, it is now clear to textual critics that the work is a collection of smaller passages edited into sections and not the work of a single hand. Most of these probably circulated orally, perhaps as single teachings or in small collections. Later they were gathered and arranged by an editor.

The internal structure of the DDJ is only one ground for the denial of a single author for the text. The fact that we now know there were multiple versions of the DDJ, even as early as 300 B.C.E., also suggests that it is unlikely that a single author wrote just one book that we now know as the DDJ. Consider that for almost 2,000 years, the Chinese text used by commentators in China, and upon which all except the most recent Western-language translations were based, has been called the Wang Bi, after the commentator who made a complete edition of the DDJ sometime between 226 and 249 C.E. Although Wang Bi was not a Daoist, the commentary he wrote after collecting and editing the text became a standard interpretive guide, and generally speaking even today scholars depart from his arrangement of the actual text only when they can make a compelling argument for doing so. However, based on recent archaeological finds at Guodian in 1993 and Mawangdui in the 1970s, we have no doubt that there were several simultaneously circulating versions of the DDJ text that pre-dated Wang Bi’s compilation of what we now call the “received text.”

Mawangdui is the name for a site of tombs discovered near Changsha in Hunan province. The Mawangdui discoveries include two incomplete editions of the DDJ on silk scrolls (boshu), now simply called “A” and “B.” These versions have two principal differences from the Wang Bi. First, there are some word-choice divergences. Second, the order of the chapters is reversed, with chapters 38-81 of the Wang Bi coming before chapters 1-37 in the Mawangdui versions. More precisely, the order of the Mawangdui texts takes the traditional 81 chapters and sets them out like this: 38, 39, 40, 42-66, 80, 81, 67-79, 1-21, 24, 22, 23, 25-37. Robert Henricks has published a translation of these texts with extensive notes and comparisons with the Wang Bi under the title Lao-Tzu, Te-tao Ching.

The Guodian find consists of 730 inscribed bamboo slips found in a tomb near the village of Guodian in Hubei province in 1993. There are 71 slips with material that is also found in 31 of the 81 chapters of the DDJ, corresponding only to chapters 1-66. Based on the probable date of the closing of the tomb, the version of the DDJ found within it may date as early as c. 300 B.C.E.

3. The First Biography and the Establishment of Laozi as the Founder of Daoism

We have now arrived at the stage where studies of Laozi’s biography usually begin.

The first known attempt to write a biography of Laozi is in the Shiji (Historical Records) by Sima Qian (145-89 B.C.E.). According to this text, Laozi was a native of Chu, a southern state of the Zhou dynasty. His surname was Li, his personal name was Er, and his style name was Dan. Sima Qian reports that Laozi was a historiographer in charge of the archives of Zhou. Moreover, Sima Qian tells us that Confucius had traveled to see Laozi to learn about the performance of rituals from him. According to The Book of Rites (Liji), a master known as Lao Dan was an expert on mourning rituals. On four occasions, Confucius (Kongzi, Master Kong) is reported to have responded to questions by appealing to answers given by Lao Dan. The records even say that Confucius once assisted him in a burial service. Just what date we can put on this record from The Book of Rites is uncertain, but it may have informed Sima Qian’s biography.

According to the biography, during the course of their conversations Laozi told Confucius to give up his prideful ways and his seeking of power. When Confucius returned to his disciples, he told them that he was overwhelmed by the commanding presence of Laozi, which was like that of a mighty dragon. The biography goes on to say that Laozi cultivated the dao and its de. However, as the state of Zhou continued to decline, Laozi decided to leave China through the Western pass (toward India), and upon his departure he gave to the keeper of the pass, one Yin Xi, a book divided into two parts, one on dao and one on de, and some 5,000 characters in length. After that, no one knew what became of him. This is perhaps the most familiar of the traditions narrated by Sima Qian, and it contains the core of almost every subsequent biography or hagiography of Laozi of significance. However, the biography did not end here. Sima Qian went on to record what other sources said about Laozi.

In the first biography, Sima Qian says some report that Laolaizi came from Chu, was a contemporary of Confucius, and authored a work in fifteen sections which speaks of the practical uses of the Daoist teachings. But Sima Qian leaves it undecided whether he thinks Laolaizi should be identified with Laozi, even if he does include this reference in the section on Laozi.

Sima Qian adds another layer to the biography, without commenting on the degree of confidence he has in its truthfulness: it is said that Laozi lived 160 or even 200 years, as a result of cultivating the dao and nurturing his longevity.

An additional tradition included in the first biography is that Dan, the historiographer of Zhou, predicted in 479 B.C.E. that Zhou and Qin would break apart and that a new king would arise from Qin. The point of this tradition is that Dan (Lao Dan?) had the power to predict the political future of the people, including the fragmentation of the Zhou dynasty and the rise of the Qin in about 221 B.C.E. (that is, under Qinshihuang, the first emperor of China). But Sima Qian likewise refuses to identify Laozi with this Dan.

Finally, the first biography concludes with a reference to Laozi’s son and his descendants. Another movement in the evolution of the Laozi story was completed by about 240 B.C.E. This was necessitated by Lao Dan’s association with the grand historiographer Dan during the Zhou, who predicted the rise of the Qin state. This information, along with that of Laozi’s journey to the West and of the writing of the book for Yin Xi, won a favorable position for Laozi during the Qin dynasty. The association of Laozi with a text (the DDJ) that was becoming increasingly significant was important. However, with the demise of the Qin state, some realignment of Laozi’s connection with them was needed. So, Sima Qian’s final remarks about Laozi’s son helped to associate the philosopher’s lineage with the new Han ruling family. The journey to the West component now also had a new force. It explained why Laozi was not presently advising the Han rulers.

Overall, it seems that the earliest biography conforms closely to passages contained in Zhuangzi Chapters 11-14 and 26 in associating Laozi with the archivist or historiographer of Zhou, in Laozi’s rebuke of Confucius’s prideful seeking of fame and pursuit of power, and in the report that Confucius told his disciples that Laozi was like a great dragon. It is possible, then, that the Zhuangzi is the ultimate source of Sima Qian’s information.

Sima Qian also says, “Laozi cultivated the dao and its virtue (de).” We recognize, of course, that “dao and its virtue” is dao and de, and that this phrase is meant to solidify Laozi’s association with the Daodejing. What the Zhuangzi, Hanfeizi, and Huainanzi only alluded to by putting near quotes from the DDJ in the mouth of Laozi, Sima Qian now makes into an explicit connection. He even tells us that when the Zhou kingdom began to decline, Laozi decided to leave China and head into the West. When he reached the mountain pass, the keeper of the pass (Yin Xi) insisted that he write down his teachings, so that the people would have them after he left. So, “Laozi wrote a book in two parts, discussing the ideas of the dao and of de in some 5,000 words, and departed. No one knows where he ended his life.” These remarks make an unmistakable connection between what Laozi is said to have delivered to Yin Xi and the two sectional divisions of the DDJ, and a very close approximation to its exact number of characters.

Sima Qian classified the Six Schools as Yin-Yang, Confucian, Mohist, Legalist, School of Names, and Daoist. Since his biography located Laozi in a time period predating the Zhuangzi, and the passages in the Zhuangzi seemed to be about a person who lived in the time of Confucius (and not to be simply a literary or traditional invention), the inference was easy to make that Laozi was the founder of the Daoist school.

4. The Ongoing Laozi Myth

In The Lives of the Immortals (Liexian zhuan) by Liu Xiang (79-8 B.C.E.) there are separate entries for Laozi and Yin Xi. According to the extension of the story of Laozi’s leaving China through the Western pass found in Liu Xiang’s work, Yin became a disciple of Laozi and begged him to allow him to go to the West as well. Laozi told him that he could come along, but only after he cultivated the dao. Laozi instructed Yin to study hard and await a summons, which would be delivered to him in the marketplace in the city of Chengdu. There is now a shrine dedicated to the “ideal disciple” at the putative location of this site. Additionally, in Liu Xiang’s text it is clear that Laozi is valorized as the preeminent immortal and as a superior daoshi (fangshi) who had achieved not only immortality through wisdom and the practice of techniques for longevity, but also mastery of the arts associated with the abilities and skills of one who was united with dao (compare the “Spirit Man” living in the Gushe mountains in Zhuangzi ch. 1 and Wang Ni’s remarks on the perfected person or zhenren in ch. 2).

Another important stage in the development of Laozi’s place in Chinese philosophical history occurred when Emperor Huan (147-167 C.E.) built a palace on the traditional site of Laozi’s birthplace and authorized veneration and sacrifice to Laozi. The “Inscription to Laozi” (Laozi ming), written by Pian Shao in c. 166 C.E. as a commemorative marker for the site, goes well beyond Sima Qian’s biography. It represents the first apotheosis of Laozi into a deity. The text makes reference to the many cosmic metamorphoses of Laozi, portraying him as having been counselor to the great sage kings of China’s misty pre-history. Accordingly, during this period of the 2nd and 3rd centuries, the elite at the imperial court divinized Laozi and regarded him as an embodiment or incarnation of the dao, a kind of cosmic emperor who knew how to bring things into perfect harmony and peace by acting in wu-wei.

The Daoist cosmological belief in the powers of beings who experienced unity with the dao to effect transformation of their bodies and powers (for example, Huzi in Zhuangzi, ch. 7) was the philosophical underpinning of the work Classic on the Transformations of Laozi (Laozi bianhua jing, late 100s C.E., now available in a Dunhuang manuscript dating to 612 C.E.). This work reflects some of the ideas in Pian Shao’s inscription, but takes them even further. It tells how Laozi transformed into his own mother and gave birth to himself, taking quite literally comments in the DDJ where the dao is portrayed as the mother of all things (DDJ, ch. 1). The work associates Laozi with various manifestations or incarnations of the dao itself. In this text there is a complete apotheosis of Laozi into a numinal divinity. “Laozi rests in the great beginning, wanders in the great origin, floats through dark, numinous emptiness…He joins serene darkness before its opening, is present in original chaos before the beginnings of time….Alone and without relation, he has existed since before heaven and earth. Living deeply hidden, he always returns to be. Gone, the primordial; Present, a man” (quoted in Kohn, “Myth,” 47). The final passage in this work is an address given by Laozi predicting his reappearance and promising liberation from trouble and the overthrow of the Han dynasty, an allusion that helps us fix the probable date of origin for the work. The millennial cults of the second century believed Laozi was a messianic figure who appeared to their leaders and gave them instructions and revelations (for example, the hagiography of Zhang Daoling, founder of the Celestial Master Zhengyi movement, contained in the 5th-century work Taiping Guangji 8).

The period of the Celestial Masters (c. 142-260 C.E.) produced documents enhancing the myth of Laozi, who came then to be called Laojun (Lord Lao) or Taishang Laojun (Most High Lord Lao). Laojun could manifest himself in any time of unrest and bring Great Peace (taiping). Yet, the Celestial Masters never claimed that Laojun had done so in their day. Instead of such a direct manifestation, Celestial Masters practitioners taught that Laojun transmitted to them talismans, registers, and new scriptures in the form of texts to guide the creation of communities of heavenly peace. One work, very likely from the late 3rd or early 4th century C.E., entitled The Hundred and Eighty Precepts Spoken by Lord Lao (Laojun shuo yibai bashi jie), became the earliest set of behavioral guides for Celestial Masters communities. According to the text, Laozi delivered these precepts after returning from India and finding the people in a state of corruption.

During the reign of Emperor Huidi of the Western Jin dynasty (290-306 C.E.), Wang Fu, a master within the Daoist sectarian group known as the Celestial Masters, often debated with the Buddhist monk Bo Yuan about philosophical beliefs. As a result of these exchanges, scholarly consensus holds that Wang Fu compiled a one-scroll work entitled Classic of the Conversion of the Barbarians (Huahu jing, c. 300 C.E.). The work is also known by the title The Supreme Numinous Treasure’s Sublime Classic on Laozi’s Conversion of the Barbarians (Taishang lingbao Laozi huahu miaojing). Perhaps the most inflammatory claim of this work was its teaching that when Laozi left China through the Western pass he went to India, where he transformed into the historical Buddha and converted the barbarians. The basic implication of the book was that Buddhism was actually only a form of Daoism. This work inflamed Buddhists for decades. In fact, both of the Tang emperors Gaozong (r. 649-683 C.E.) and Zhongzong (r. 705-710 C.E.) gave imperial orders prohibiting its distribution. However, as bitter contention continued between Buddhism and Daoism, the Daoists actually expanded the Classic of the Conversion of the Barbarians, so that by 700 C.E. it was ten scrolls in length. Four of these were recovered in the Dunhuang cache of manuscripts. The much extended work came to include the account that Laozi entered the mouth of a queen in India and the next year was born from her right armpit to become the Buddha. He walked immediately after his birth, and “from then on Buddhist teaching came to flourish.” To those familiar with the hagiographies of the Buddha, virtually all of this birth account is recognizable as belonging to the Buddha, not Laozi.

In the course of the production of polemical writings on the Buddhist side of the debate, attempts were made to turn the tables on the Daoists. Laozi was portrayed as a bodhisattva or disciple of the Buddha sent to convert the Chinese. This theory had other desirable extensions from a Buddhist viewpoint, because it was also applied to Confucius, enabling Buddhist rhetoricians to hold that Confucius too was an avatar of the Buddha and that Confucianism was actually a distorted form of Buddhism.

Most later writings about Laozi continued to base their appeals to Laozi’s authority on his ongoing transformations, but they likewise provide evidence of the growing tension between Daoism and Buddhism. The first mythological account of Laozi’s birth is in the Classic of the Inner Explanation of the Three Heavens (Santian neijie jing), a Celestial Master work dated to about 420 C.E. In this text, Laozi has three births: as the manifestation of the dao from pure energy to become a deity in heaven; in human form as the ancient philosopher author of the DDJ; and as the Buddha after his journey to the West. In the first birth, his mother is known as “The Jade Maiden of Mystery and Wonder.” In his second, he is born to a human woman known as Mother Li. This was an eighty-one-year pregnancy, after which he was born from her left armpit (there is a tradition that the Buddha had been born from his mother’s right armpit). At birth he had white hair, and so he was called laozi (here meaning something more like lao haizi or “Old Child”). This birth is set in the time of the Shang dynasty, several centuries before the date Sima Qian reports. But the purpose of such a move in the Laozi legend is to allow him time to travel to the West and then become the Buddha. The third birth takes place in India as the Buddha.

In the Yuan dynasty (1285 C.E.), Emperor Shizu ordered the burning of the Daoist canon of texts, and according to lore, the first writing destroyed was the greatly extended version of the Classic of the Conversion of the Barbarians in ten or more scrolls. Once again, though, the text and its story of Laozi proved quite resilient. It reappeared in the form of an illustrated work entitled Eighty-one Transformations of Lord Lao (Laojun bashiyi hua tushuo). The Buddhist thinker Xiangmai wrote a detailed but polemical history of this text, and few scholars trust its reliability. Whether the Eighty-one Transformations of Lord Lao still survives is arguable, although a work entitled Eighty-one Transformations of the Most High Lord Lao of Mysterious Origin of the Golden Portal (Jinque xuanyuan Taishang Laojun bashiyi hua tushuo), with illustrations and dating to 1598, is held in the Museum für Völkerkunde in Berlin. The version in Berlin provides an illustration for each of Laozi’s transformations, each accompanied by a short text. The first few depict his existence in cosmic time. It is not until the 11th transformation that he enters historical time, during the era of Fu Xi, under the name Yuhuazi. In his 34th transformation, Laozi sends Yin Xi to explain the sutras to the Indian barbarians. The 58th transformation is Laozi’s appearance in the clouds to Zhang Daoling, the founder of the Celestial Master Zhengyi sect of Daoism that still exists today.

Ge Hong’s (283-343 C.E.) The Inner Chapters of the Master Who Embraces Simplicity (Baopuzi neipian) is arguably the most important Daoist philosophical work of the Jin dynasty. In this text, Ge Hong reports that in a state of visualization he saw Laozi, seven feet tall, with cloudlike garments of five colors, wearing a multi-tiered cap and carrying a sharp sword. According to Ge Hong, Laozi had a prominent nose, long eyebrows, and an elongated head. This physical type became the template for portraying immortals in Daoist art. Liu Xiang’s Collected Biographies of the Immortals (Liexian zhuan, c. 18 B.C.E.) reports that Laozi was born during the Shang dynasty, served as an archivist under the Zhou, was a teacher of Confucius, and later made his way to the West, just as Sima Qian’s standard biography says. Ge Hong, for his part, also collected and edited the Biographies of Immortals (Shenxian zhuan). In its entry on Laozi, Ge Hong praises Laozi’s practice of stillness and wu-wei, but he also represents Laozi as a master of the techniques of immortality and of the efficacy of external alchemy, herbs, and the control of qi. He attributes to Laozi what is called the alchemy of the nine cinnabars and eight minerals, as well as a vast knowledge of herbology and dietetics. Ge Hong also tells a story about one Xu Jia, who was a retainer of Laozi. In the story, Laozi keeps Xu Jia alive by means of a powerful talisman placed in Xu’s mouth. Its removal causes Xu’s death. When it is replaced, Xu Jia lives again. In all this, Laozi is portrayed as a master of life and death by means of talismanic power, a practice used by the Celestial Masters and continued by Daoist masters as late as the Ming dynasty, if not into the present era.

Other reported manifestations of Laozi gave authority to new Daoist lineages or modifications of practice. For example, the Daoist master Kou Qianzhi reported a revelation received from Laozi in 415 C.E., which was a “New Code” for Daoist practitioners and communities. He wrote down the revelation in a text that became known as the Classic on Precepts of Lord Lao Recited to the Melody of the Clouds (Laojun yinsong jiejing). This text contains 36 moral precepts, each of which traces its authority to the introductory phrase, “Lord Lao said….” Textual traces are not the only sources for the traditions and views of Laozi in Chinese philosophical history. Yoshiko Kamitsuka has done a study of how views about Laozi changed and were reflected in material culture, especially sculpture and inscription.

Laozi was also often looked to for political validation.  Throughout most of the Tang dynasty (618-907 C.E.), Laozi was regarded as the protector of the state because of the tradition that both the Tang ruling family and Laozi shared the surname Li and because of many reports of auspicious appearances of Laozi at the inauguration of the Tang dynasty in which he pledged his support during the rise and solidification of the ruling bureaucracy.

The hagiography of Laozi has continued to develop, down to the present day. There are even traditions that various natural geographic landmarks and features are the enduring imprint of Lord Lao on China and that his face can be seen in them. It is more likely, of course, that Laozi’s immortality lies in the mark made by the philosophical movement he has come to represent and the culture it created.

5. References and Further Reading

  • Ames, Roger. (1998). Wandering at Ease in the Zhuangzi. Albany: State University of New York Press.
  • Bokenkamp, Stephen R. (1997). Early Daoist Scriptures. Berkeley: University of California Press.
  • Boltz, William. (2005). “The Composite Nature of Early Chinese Texts.” In Text and Ritual in Early China, ed. Martin Kern. 50-78. Seattle: University of Washington Press.
  • Csikszentmihalyi, Mark and Ivanhoe, Philip J., eds. (1999). Religious and Philosophical Aspects of the Laozi. Albany: State University of New York Press.
  • Giles, Lionel. (1948). A Gallery of Chinese Immortals. London: John Murray.
  • Graham, Angus. (1981). Chuang tzu: The Inner Chapters. London: Allen & Unwin.
  • Graham, Angus. (1989). Disputers of the Tao: Philosophical Argument in Ancient China. La Salle, IL: Open Court.
  • Graham, Angus. (1998 [1986]). “The Origins of the Legend of Lao Tan.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 23-41. Albany: State University of New York Press.
  • Hansen, Chad. (1992). A Daoist Theory of Chinese Thought. New York: Oxford University Press.
  • Henricks, Robert. (1989). Lao-Tzu: Te-Tao Ching. New York: Ballantine.
  • Ivanhoe, Philip J. (2002). The Daodejing of Laozi. New York: Seven Bridges Press.
  • Kamitsuka, Yoshiko. (1998). “Lao-Tzu in Six Dynasties Taoist Sculpture.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 63-89. Albany: State University of New York Press.
  • Kim, Tae Hyun. (2010). “Other Laozi Parallels in the Hanfeizi: An Alternative Approach to the Textual History of the Laozi and Early Chinese Thought.” Sino-Platonic Papers 199 (March 2010), ed. Victor H. Mair. Philadelphia: University of Pennsylvania Press.
  • Kohn, Livia (2008). “Laojun yinsong jiejing [Classic on Precepts of Lord Lao, Recited to the Melody in the Clouds].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
  • Kohn, Livia. (1998). “The Lao-Tzu Myth.” In Lao-tzu and the Tao-te-ching, ed. Livia Kohn and Michael LaFargue, 41-63. Albany: State University of New York Press.
  • Kohn, Livia. (1996). “Laozi: Ancient Philosopher, Master of Longevity, and Taoist God.” In Religions of China in Practice, ed. Donald S. Lopez, 52-63. Princeton: Princeton University Press.
  • Kohn, Livia and LaFargue, Michael. (1998). Lao-tzu and the Tao-te-ching. Albany: State University of New York Press.
  • Kohn, Livia and Roth, Harold. (2002). Daoist Identity: History, Lineage, and Ritual. Honolulu: University of Hawaii Press.
  • Nylan, Michael and Csikszentmihalyi, Mark. (2003). “Constructing Lineages and Inventing Traditions through Exemplary Figures in Early China.” T’oung Pao 89: 1-41.
  • Penny, Benjamin (2008). “Laojun bashiyi huatu [Eighty-one Transformations of Lord Lao].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
  • Penny, Benjamin (2008). “Laojun shuo yibai bashi jie [The 180 Precepts Spoken by Lord Lao].” In Encyclopedia of Taoism, ed. Fabrizio Pregadio. London: Routledge.
  • Smith, Kidder (2003). “Sima Tan and the Invention of Daoism, ‘Legalism,’ et cetera.” The Journal of Asian Studies 62.1: 129-156.
  • Watson, Burton. (1968). The Complete Works of Chuang Tzu. New York: Columbia University Press.
  • Welch, Holmes. (1966). Taoism: The Parting of the Way. Boston: Beacon Press.
  • Welch, Holmes and Seidel, Anna, eds. (1979). Facets of Taoism. New Haven: Yale University Press.

 

Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Daoist Philosophy

Along with Confucianism, “Daoism” (sometimes called “Taoism”) is one of the two great indigenous philosophical traditions of China. As an English term, Daoism corresponds to both Daojia (“Dao family” or “school of the Dao”), an early Han dynasty (c. 100s B.C.E.) term which describes so-called “philosophical” texts and thinkers such as Laozi and Zhuangzi, and Daojiao (“teaching of the Dao”), which describes various so-called “religious” movements dating from the late Han dynasty (c. 100s C.E.) onward.  Thus, “Daoism” encompasses thought and practice that sometimes are viewed as “philosophical,” as “religious,” or as a combination of both.  While modern scholars, especially those in the West, have been preoccupied with classifying Daoist material as either “philosophical” or “religious,” historically Daoists themselves have been uninterested in such categories and dichotomies.  Instead, they have preferred to focus on understanding the nature of reality, increasing their longevity, ordering life morally, practicing rulership, and regulating consciousness and diet.  Fundamental Daoist ideas and concerns include wuwei (“effortless action”), ziran (“naturalness”), how to become a shengren (“sage”) or zhenren (“perfected person”), and the ineffable, mysterious Dao (“Way”) itself.

Table of Contents

  1. What is Daoism?
  2. Classical Sources for Our Understanding of Daoism
  3. Is Daoism a Philosophy or a Religion?
  4. The Daodejing
  5. Fundamental Concepts in the Daodejing
  6. The Zhuangzi
  7. Basic Concepts in the Zhuangzi
  8. Daoism and Confucianism
  9. Daoism in the Han
  10. Celestial Masters Daoism
  11. Neo-Daoism
  12. Shangqing and Lingbao Daoist Movements
  13. Tang Daoism
  14. The Three Teachings
  15. The “Destruction” of Daoism
  16. References and Further Reading

1. What is Daoism?

Strictly speaking, there was no Daoism before the literati of the Han dynasty (c. 200 B.C.E.) tried to organize the writings and ideas that represented the major intellectual alternatives available. The name daojia, “Dao family” or “school of the dao,” was a creation of the historian Sima Tan (d. 110 B.C.E.) in his Shi ji (Records of the Historian), written in the 2nd century B.C.E. and later completed by his son, Sima Qian (145-86 B.C.E.). In Sima Qian’s classification, the Daoists are listed as one of the Six Schools: Yin-Yang, Confucian, Mohist, Legalist, School of Names, and Daoist. So, Daoism was a retroactive grouping of ideas and writings which were already at least one to two centuries old, and which may or may not have been ancestral to various post-classical religious movements, all self-identified as daojiao (“teaching of the dao”), beginning with the reception of revelations from the deified Laozi by the Celestial Masters (Tianshi) lineage founder, Zhang Daoling, in 142 C.E. This article privileges the formative influence of early texts, such as the Daodejing and the Zhuangzi, but accepts contemporary Daoists’ assertion of continuity between classical and post-classical, “philosophical” and “religious” movements and texts.

2. Classical Sources for Our Understanding of Daoism

Daoism does not name a tradition constituted by a founding thinker, even though the common belief is that a teacher named Laozi originated the school and wrote its major work, called the Daodejing, also sometimes known as the Laozi. The tradition is also called “Lao-Zhuang” philosophy, referring to what are commonly regarded as its two classical and most influential texts: the Daodejing or Laozi (3rd century B.C.E.) and the Zhuangzi (4th-3rd centuries B.C.E.). However, various streams of thought and practice were passed along by masters (daoshi) before these texts were finalized. There are two major source issues to be considered when forming a position on the origins of Daoism. (1) What evidence is there for beliefs and practices later associated with the kind of Daoism recognized by Sima Qian prior to the formation of the two classical texts? (2) What is the best reconstruction of the classical textual tradition upon which later Daoism was based?

With regard to the first question, Isabelle Robinet thinks that the classical texts are only the most lasting evidence of a movement she connects with a set of writings and practices associated with the Songs of Chu (Chuci) and identifies as the Chuci movement. This movement reflects a culture in which male and female masters variously called fangshi, daoshi, zhenren, or daoren practiced techniques of longevity and used diet and meditative stillness to create a way of life that attracted disciples and resulted in wisdom teachings. While Robinet’s interpretation is controversial, there are undeniable connections between the Songs of Chu and later Daoist ideas. Some examples include a coincidence of names of immortals (sages), a commitment to the pursuit of physical immortality, a belief in the epistemic value of stillness and quietude, abstinence from grains, breathing and sexual practices used to regulate internal energy (qi), and the use of ritual dances that resemble those still done by Daoist masters (the step of Yu).

In addition to the controversial connection to the Songs of Chu, the Guanzi (350-250 B.C.E.) is a text older than both the Daodejing and probably all of the Zhuangzi, except the “inner chapters” (see below). The Guanzi  is a very important work of 76 “chapters.” Three of the chapters of the Guanzi are called the Neiye, a title which can mean “inner cultivation.” The self-cultivation practices and teachings put forward in this material may be fruitfully linked to several other important works: the Daodejing; the Zhuangzi; a Han dynasty Daoist work called the Huainanzi; and an early commentary on the Daodejing called the Xiang’er. Indeed, there is a strong meditative trend in the Daoism of late imperial China known as the “inner alchemy” tradition and the views of the Neiye seem to be in the background of this movement. Two other chapters of the Guanzi are called Xin shu (Heart-mind book). The Xin shu connects the ideas of quietude and stillness found in both the Daodejing and Zhuangzi to longevity practices. The idea of dao in these chapters is very much like that of the classical works. Its image of the sage resembles that of the Zhuangzi. It uses the same term (zheng) that Zhuangzi uses for the corrections a sage must make in his body, the pacification of the heart-mind, and the concentration and control of internal energy (qi). These practices are called “holding onto the One,” “keeping the One,” “obtaining the One,” all of which are phrases also associated with the Daodejing (chs. 10, 22, 39).

The Songs of Chu and Guanzi still represent texts which are themselves creations of actual practitioners of Daoist teachings and sentiments, just as do the Daodejing and Zhuangzi. Who these persons were we do not know with certainty. It is possible that we do have the names, remarks, and practices of some of these individuals (daoshi) embodied in the passages of the Zhuangzi. For example, in Chs. 1-7 alone we find Xu You (ch. 1); Lianshu (ch. 1); Ziqi (ch. 2); Wang Ni (ch. 2); Changwuzi (ch. 2); Qu Boyu (ch. 4); Carpenter Shi (ch. 4); Bohun Wuren (ch. 5); Nu Yu (ch. 6); Sizi, Yuzi, Lizi, and Laizi (ch. 6); Zi Sanghu, Meng Zifan, and Zi Qinzan (ch. 6); Yuzi and Sangzi (ch. 6); Wang Ni and Putizi (ch. 7); Jie Yu (ch. 7); Lao Dan (ch. 7); and Huzi (ch. 7).

As for a reasonable reconstruction of the textual tradition upon which Daoism is based, we should not try to think of this task so simply as determining the relationship between the Daodejing and the Zhuangzi, such as which text was first and which came later. These texts are composite. The Zhuangzi, for example, repeats in very similar form sayings and ideas found in the Daodejing, especially in the essay composing Zhuangzi Chs. 8-10. However, we are not certain whether this means that whoever was the source of this material in the Zhuangzi knew the Daodejing and quoted it, or if they both drew from a common source, or even if the Daodejing in some way depended on the Zhuangzi. In fact, one theory about the legendary figure Laozi is that he was created first in the Zhuangzi and later became associated with the Daodejing. There are seventeen passages in which Laozi (a.k.a. Lao Dan) plays a role in the Zhuangzi, and he is not mentioned by name in the Daodejing.

Based on what we know now, we could offer the following summary of the sources of early Daoism. Stage One: Zhuang Zhou’s “inner chapters” (chs. 1-7) of the Zhuangzi (c. 350 B.C.E.) and some components of the Guanzi, including perhaps both the Neiye and the Xin shu. Stage Two: The essay in Chs. 8-10 of the Zhuangzi and some collections of material which represent versions of our final redaction of the Daodejing, as well as Chs. 17-28 of the Zhuangzi representing materials likely gathered by Zhuang Zhou’s disciples. Stage Three: the “Yellow Emperor” (Huang-Lao) manuscripts from Mawangdui, the Yellow Emperor materials of the Zhuangzi (Chs. 11-19 and 22), and the text known as the Huainanzi (c. 139 B.C.E.).

3. Is Daoism a Philosophy or a Religion?

In the late 1970s, Western and comparative philosophers began to point out that an important dimension of the historical context of Daoism was being overlooked: the previous generation of scholars had ignored or even disparaged connections between the classical texts and Daoist religious belief and practice, which was not previously thought to have developed until the 2nd century C.E. We have to lay some of the responsibility for a prejudice against Daoism as a religion and the privileging of its earliest forms as a pure philosophy at the feet of the eminent translators and philosophers Wing-Tsit Chan and James Legge, who both spoke of Daoist religion as a degeneration of a pristine Daoist philosophy arising from the time of the Celestial Masters (see below) in the late Han period. Chan and Legge were instrumental architects in the West of the view that Daoist philosophy (daojia) and Daoist religion (daojiao) are entirely different traditions.

Actually, our interest in trying to separate philosophy and religion in Daoism is more revealing of the Western frame of reference we use than of Daoism itself. Daoist ideas fermented among master teachers who had a holistic view of life. These daoshi (Daoist masters) did not compartmentalize practices by which they sought to influence the forces of reality, increase their longevity, have interaction with realities not apparent to our normal way of seeing things, and order life morally and by rulership. They offered insights we might call philosophical aphorisms. But they also practiced meditative stillness and emptiness to gain knowledge, engaged in physical exercises to increase the flow of inner energy (qi), studied nature for diet and remedy to foster longevity, practiced rituals related to their view that reality had many layers and forms with whom/which humans could interact, wrote talismans and practiced divination, engaged in spellbinding of “ghosts,” led small communities, and advised rulers on all these subjects. The masters transmitted their teachings, some of them only to disciples and adepts, but gradually these teachings became more widely available, as is evidenced in the very creation of the Daodejing and Zhuangzi themselves.

The anti-supernaturalist and anti-dualist agendas that provoked Westerners to separate philosophy and religion, dating at least to the classical Greek period of philosophy, were not part of the preoccupations of Daoists. Accordingly, the question of whether Daoism is a philosophy or a religion is not one we can ask without imposing a set of understandings, presuppositions, and qualifications that do not apply to Daoism. But the hybrid nature of Daoism is not a reason to discount the importance of Daoist thought. Quite to the contrary, it may be one of the most significant ideas classical Daoism can contribute to the study of philosophy in the present age.

4. The Daodejing

The Daodejing (hereafter, DDJ) is divided into 81 “chapters” consisting of slightly over 5,000 Chinese characters, depending on which text is used. In its received form from Wang Bi (see below), the two major divisions of the text are the dao jing (chs. 1-37) and the de jing (chs. 38-81). Actually, this division probably rests on little else than the fact that the principal concept opening Chapter 1 is dao (way) and that of Chapter 38 is de (virtue). The text is a collection of short aphorisms that were not arranged to develop any systematic argument. The long-standing tradition about the authorship of the text is that the “founder” of Daoism, known as Laozi, gave it to Yin Xi, the guardian of the pass through the mountains that he used to go from China to the West (that is, India), at some unknown date in the distant past. But the text is actually a composite of collected materials, most of which probably originally circulated orally, perhaps even in single aphorisms or small collections. These were then redacted as someone might string pearls into a necklace. Although D.C. Lau and Michael LaFargue have made preliminary literary and redaction-critical studies of the texts, these are still insufficient to generate any consensus about whether the text was composed from smaller written collections, or about who the probable editors were.

For almost 2,000 years, the Chinese text used by commentators in China, and upon which all except the most recent Western-language translations were based, has been called the Wang Bi, after the commentator who used a complete edition of the DDJ sometime between 226 and 249 C.E. Although Wang Bi was not a Daoist, his commentary became a standard interpretive guide, and generally speaking, even today scholars depart from it only when they can make a compelling argument for doing so. Based on recent archaeological finds at Guodian in 1993 and Mawangdui in the 1970s, we are certain that there were several simultaneously circulating versions of the Daodejing text as early as c. 300 B.C.E.

Mawangdui is the name for a site of tombs discovered near Changsha in Hunan province. The Mawangdui discoveries consist of two incomplete editions of the DDJ on silk scrolls (boshu), now simply called “A” and “B.” These versions have two principal differences from the Wang Bi. First, some word-choice divergences are present. Second, the order of the chapters is reversed, with chapters 38-81 in the Wang Bi coming before chapters 1-37 in the Mawangdui versions. More precisely, the order of the Mawangdui texts takes the traditional 81 chapters and sets them out like this: 38, 39, 40, 42-66, 80, 81, 67-79, 1-21, 24, 22, 23, 25-37. Robert Henricks has published a translation of these texts with extensive notes and comparisons with the Wang Bi under the title Lao-Tzu, Te-tao Ching (1989). Contemporary scholarship associates the Mawangdui versions with a type of Daoism known as the Way of the Yellow Emperor and the Old Master (Huanglao Dao).

The Guodian find consists of 730 inscribed bamboo slips found near the village of Guodian in Hubei province in 1993. There are 71 slips with material that is also found in 31 of the 81 chapters of the DDJ, corresponding to Chapters 1-66. It may date as early as c. 300 B.C.E. If this is a correct date, then the Daodejing was already extant in a written form when the “inner chapters” (see below) of the Zhuangzi were composed. These slips contain more significant variants from the Wang Bi than do the Mawangdui versions. A complete translation and study of the Guodian cache has been published by Scott Cook (2013).

5. Fundamental Concepts in the Daodejing

The term Dao means a road, and is often translated as “the Way.” This is because sometimes dao is used as a noun (that is, “the dao”) and other times as a verb (that is, daoing). Dao is the process of reality itself, the way things come together, while still transforming. All this reflects the deep-seated Chinese belief that change is the most basic character of things. In the Yi jing (Classic of Change), the patterns of this change are symbolized by figures standing for 64 relations of correlative forces, known as the hexagrams. Dao is the alteration of these forces, most often simply stated as yin and yang. The Xici is a commentary on the Yi jing formed in about the same period as the DDJ. It takes the taiji (Great Ultimate) as the source of correlative change and associates it with the dao. The contrast is not between what things are or that something is or is not, but between chaos (hundun) and the way reality is ordering (de). Yet, reality is not ordering into one unified whole. It is the 10,000 things (wanwu). There is the dao but not “the World” or “the cosmos” in a Western sense.

The Daodejing teaches that humans cannot fathom the Dao, because any name we give to it cannot capture it. It is beyond what we can express in language (ch. 1). Those who experience oneness with dao, known as “obtaining dao,” will be enabled to wu-wei. Wu-wei is a difficult notion to translate. Yet, it is generally agreed that the traditional rendering of it as “nonaction” or “no action” is incorrect. Those who wu-wei do act. Daoism is not a philosophy of “doing nothing.” Wu-wei means something like “act naturally,” “effortless action,” or “nonwillful action.” The point is that there is no need for human tampering with the flow of reality. Wu-wei should be our way of life, because the dao always benefits; it does not harm (ch. 81). The way of heaven (dao of tian) is always on the side of good (ch. 79), and virtue (de) comes forth from the dao alone (ch. 21). What causes this natural embedding of good and benefit in the dao is vague and elusive (ch. 35); not even the sages understand it (ch. 76). But the world is a reality that is filled with spiritual force, just as a sacred image used in religious ritual might be inhabited by numinal power (ch. 29). The dao occupies the place in reality that is analogous to the part of a family’s house set aside for the altar for venerating the ancestors and gods (the ao of the house, ch. 62). When we think that life’s occurrences seem unfair (a human discrimination), we should remember that heaven’s (tian) net misses nothing; it leaves nothing undone (ch. 37).

A central theme of the Daodejing is that correlatives are the expressions of the movement of dao. Correlatives in Chinese philosophy are not opposites, mutually excluding each other. They represent the ebb and flow of the forces of reality: yin/yang, male/female, excess/defect, leading/following, active/passive. As one approaches the fullness of yin, yang begins to emerge, and vice versa. The text’s teachings on correlation often suggest to interpreters that the DDJ is filled with paradoxes. For example, ch. 22 says, “Those who are crooked will be perfected. Those who are bent will be straight. Those who are empty will be full.” While these appear paradoxical, they are probably better understood as correlational in meaning. The DDJ says, “straightforward words seem paradoxical,” implying, however, that they are not (ch. 78).

What is the image of the ideal person, the sage (sheng ren), or the perfected person (zhen ren) in the DDJ? Well, sages wu-wei (chs. 2, 63). They act effortlessly and spontaneously as one with dao and in so doing, they “virtue” (de) without deliberation or volitional challenge. In this respect, they are like newborn infants, who move naturally, without planning and reliance on the structures given to them by culture and society (ch. 15). The DDJ tells us that sages empty themselves, becoming void of the discriminations used in conventional language and culture. Sages concentrate their internal energies (qi). They clean their vision (ch. 10). They manifest naturalness and plainness, becoming like uncarved wood (pu) (ch. 19). They live naturally and free from desires rooted in the discriminations that human society makes (ch. 37). They settle themselves and know how to be content (ch. 46). The DDJ makes use of some very famous analogies to drive home its point. Sages know the value of emptiness as illustrated by how emptiness is used in a bowl, door, window, valley or canyon (ch. 11). They preserve the female (yin), meaning that they know how to be receptive to dao and its power (de) and are not unbalanced favoring assertion and action (yang) (ch. 28). They shoulder yin and embrace yang, blend internal energies (qi), and thereby attain harmony (he) (ch. 42). Those following the dao do not strive, tamper, or seek to control their own lives (ch. 64). They do not endeavor to help life along (ch. 55), or use their heart-mind (xin) to “solve” or “figure out” life’s apparent knots and entanglements (ch. 55). Indeed, the DDJ cautions that those who would try to do something with the world will fail; they will actually ruin both themselves and the world (ch. 29). Sages do not engage in disputes and arguing, or try to prove their point (chs. 22, 81). They are pliable and supple, not rigid and resistive (chs. 76, 78). They are like water (ch. 8), finding their own place, overcoming the hard and strong by suppleness (ch. 36). Sages act with no expectation of reward (chs. 2, 51). They put themselves last and yet come first (ch. 7). They never make a display of themselves (chs. 22, 72). They do not brag or boast (chs. 22, 24), and they do not linger after their work is done (ch. 77). They leave no trace (ch. 27). Because they embody dao in practice, they have longevity (ch. 16). They create peace (ch. 32). Creatures do not harm them (chs. 50, 55). Soldiers do not kill them (ch. 50). Heaven (tian) protects the sage and the sage’s spirit becomes invincible (ch. 67).

Among the most controversial of the teachings in the DDJ are those directly associated with rulers. Recent scholarship is moving toward a consensus that the persons who developed and collected the teachings of the DDJ played some role in advising civil administration, but they may also have been practitioners of ritual arts and what we would call religious rites. Be that as it may, many of the aphorisms directed toward rulers in the DDJ seem puzzling at first sight. According to the DDJ, the proper ruler keeps the people without knowledge (ch. 65), fills their bellies, opens their hearts, and empties them of desires (ch. 3). A sagely ruler reduces the size of the state and keeps the population small. Even though the ruler possesses weapons, they are not used (ch. 80). The ruler does not seek prominence. The ruler is a shadowy presence, never standing out (chs. 17, 66). When the ruler’s work is done, the people say they are content (ch. 17). This picture of rulership in the DDJ is all the more interesting when we remember that the philosopher and legalist political theorist named Han Feizi used the DDJ as a guide for the unification of China. Han Feizi was the foremost counselor of the first emperor of China, Qin Shihuangdi (r. 221-210 B.C.E.). However, it is a pity that the emperor used the DDJ’s admonitions to “fill the bellies and empty the minds” of the people to justify his program of destroying all books not related to medicine, astronomy, or agriculture. When the DDJ says that rulers keep the people without knowledge, it probably means that they do not encourage human knowledge as the highest form of knowing but rather encourage the people to “obtain oneness with the dao.”

6. The Zhuangzi

The second of the two most important classical texts of Daoism is the Zhuangzi. This text is a collection of stories and remembered as well as imaginary conversations. The text is well known for its creativity and skillful use of language. Within the text we find longer and shorter treatises, stories, poetry, and aphorisms. The Zhuangzi may date as early as the 4th century B.C.E., and according to imperial bibliographies of a later date, the Zhuangzi originally had 52 “chapters.” These were reduced to 33 by Guo Xiang in the 3rd century C.E., although he seems to have had the 52-chapter text available to him. Ronnie Littlejohn has argued that the later work Liezi may contain some passages from the so-called “Lost Zhuangzi” 52-chapter version. Unlike the Daodejing, which is ascribed to the mythological Laozi, the Zhuangzi may actually contain materials from a teacher known as Zhuang Zhou, who lived between 370 and 300 B.C.E. Chapters 1-7 are those most often ascribed to Zhuangzi himself (a title meaning “Master Zhuang”), and these are known as the “inner chapters.” The remaining 26 chapters had other origins, and they sometimes take different points of view from the Inner Chapters. Although there are several versions of how the remainder of the Zhuangzi may be divided, one that is gaining currency is Chs. 1-7 (Inner Chapters), Chs. 8-10 (the “Daode” essay), Chs. 11-16 and parts of 18, 19, and 22 (Yellow Emperor Chapters), and Chs. 17-28 (Zhuang Zhou’s Disciples’ material), with the remainder of the text attributable to the final redactor.

7. Basic Concepts in the Zhuangzi

Zhuangzi taught that a set of practices, including meditative stillness, helped one achieve unity with the dao and become a “perfected person” (zhenren). The way to this state is not the result of a withdrawal from life. However, it does require disengaging or emptying oneself of conventional values and the demarcations made by society. In Chapter 23 of the Zhuangzi, Nanrong Chu, inquiring of the character Laozi about the solution to his life’s worries, was answered promptly: “Why did you come with all this crowd of people?” The man looked around and confirmed he was standing alone, but Laozi meant that his problems were the result of all the baggage of ideas and conventional opinions he lugged about with him. This baggage must be discarded before anyone can be a zhenren, move in wu-wei, and express profound virtue (de).

Like the DDJ, Zhuangzi also valorizes wu-wei, especially in the Inner Chapters, the Yellow Emperor sections on rulership, and the Zhuangzi disciples’ materials in Ch. 19. For its examples of such living the Zhuangzi turns to analogies of craftsmen, athletes (swimmers), ferrymen, cicada-catching men, woodcarvers, and even butchers. One of the most famous stories in the text is that of Ding the Butcher, who learned what it means to wu wei through the perfection of his craft. When asked about his great skill, Ding says, “What I care about is dao, which goes beyond skill. When I first began cutting up oxen, all I could see was the ox itself. After three years I no longer saw the whole ox. And now—now I go at it by spirit and don’t look with my eyes. Perception and understanding have come to a stop and spirit moves where it wants. I go along with the natural makeup, strike in the big hollows, guide the knife through the big openings, and follow things as they are. So I never touch the smallest ligament or tendon, much less a main joint. A good cook changes his knife once a year—because he cuts. A mediocre cook changes his knife once a month—because he hacks. I’ve had this knife of mine for nineteen years and I’ve cut up thousands of oxen with it, and yet the blade is as good as though it had just come from the grindstone. There are spaces between the joints, and the blade of the knife has really no thickness….[I] move the knife with the greatest subtlety, until—flop! The whole thing comes apart like a clod of earth crumbling to the ground.” (Ch. 3, The Secret of Caring for Life)  The recurring point of all of the stories in Zhuangzi about wu-wei is that such spontaneous and effortless conduct as displayed by these many examples has the same feel as acting in wu-wei.  The point is not that wu-wei results from skill development.  Wu-wei is not a cultivated skill. It is a gift of oneness with dao.  The Zhuangzi’s teachings on wu-wei are closely related to the text’s consistent rejection of the use of reason and argument as means to dao (chs. 2; 12, 17, 19).

Persons who exemplify such understanding are called sages, zhenren, and immortals. Zhuangzi describes the Daoist sage in such a way as to suggest that such a person possesses extraordinary powers. Just as the DDJ said that creatures do not harm the sages, the Zhuangzi also has a passage teaching that the zhenren exhibits wondrous powers, frees people from illness, and is able to make the harvest plentiful (ch. 1). Zhenren are “spirit-like” (shen yi), cannot be burned by fire, do not feel cold in the freezing forests, and life and death have no effect on them (ch. 2). Just how we should take such remarks is not without controversy. To be sure, many Daoists in history took them literally, and an entire tradition of the transcendents or immortals (xian) was collected in text and lore.

Zhuangzi is drawing on a set of beliefs about master teachers that were probably regarded as literal by many, although some think he meant these to be taken metaphorically. For example, when Zhuangzi says that the sage cannot be harmed or made to suffer by anything that life presents, does he mean this to be taken as saying that the zhenren is physically invincible? Or, does he mean that the sage has so freed himself from all conventional understandings that he refuses to recognize poverty as any more or less desirable than affluence, to recognize blindness as worse than sight, to recognize death as any less desirable than life? As the Zhuangzi says in Chapter One, Free and Easy Wandering, “There is nothing that can harm this man.” This is also the theme of Chapter Two, On Making All Things Equal. In this chapter people are urged to “make all things one,” meaning that they should recognize that reality is one. It is a human judgment that what happens is beautiful or ugly, right or wrong, fortunate or not. The sage knows all things are one (equal) and does not judge. Our lives are snarled and jumbled so long as we make conventional discriminations, but when we set them aside, we appear to others as extraordinary and enchanted.

An important theme in the Zhuangzi is the use of immortals to illustrate various points. Did Zhuangzi believe some persons physically lived forever? Well, many Daoists did believe this. Did Zhuangzi believe that our substance was eternal and only our form changed? Almost certainly Zhuangzi thought that we were in a constant state of process, changing from one form into another (see the exchange between Master Lai and Master Li in Ch. 6, The Great and Venerable Teacher). In Daoism, immortality is the result of what may be described as a wu xing transformation. Wu xing means “five phases” and it refers to the Chinese understanding of reality according to which all things are in some state of combined correlation of qi as wood, fire, water, metal, and earth. This was not exclusively a “Daoist” physics. It underlay all Chinese “science” of the classical period, although Daoists certainly made use of it. Zhuangzi wants to teach us how to engage in transformation through stillness, breathing, and experience of numinal power (see ch. 6). And yet, perhaps Zhuangzi’s teachings on immortality mean that the person who is free of discrimination makes no difference between life and death. In the words of Lady Li in Ch. 2, “How do I know that the dead do not wonder why they ever longed for life?”

Huangdi (the Yellow Emperor) is the most prominent immortal mentioned in the text of the Zhuangzi, and he is a main character in the sections of the book called “the Yellow Emperor Chapters” noted above. He has long been venerated in Chinese history as a cultural exemplar and the inventor of civilized human life. Daoism is filled with other accounts designed to show that those who learn to live according to the dao have long lives. Pengzu, one of the characters in the Zhuangzi, is said to have lived eight hundred years. The most prominent female immortal is Xiwangmu (Queen Mother of the West), who was believed to reign over the sacred and mysterious Mount Kunlun.

The passages containing stories of the Yellow Emperor in Zhuangzi provide a window into the views of rulership in the text. On the one hand, the Inner Chapters (chs. 1-7) reject the role of ruler as a viable vocation for a zhenren and consistently stress the futility of government and politics (ch. 7). On the other hand, the Yellow Emperor materials in Chs. 11-13 present rulership as valuable, so long as the ruler acts by wu-wei. This second position is also that taken in the work entitled the Huainanzi (see below).

The Daoists did not think of immortality as a gift from a god, or an achievement in the religious sense commonly thought of in the West. It was a result of finding harmony with the dao, expressed through wisdom, meditation, and wu-wei. Persons who had such knowledge were reputed to live in the mountains, thus the character for xian (immortal) is made up of two components, the one being shan “mountain” and the other being ren “person.” Undoubtedly, some removal to the mountains was a part of the journey to becoming a zhenren “true person.” Because Daoists believed that nature and our own bodies were correlations of each other, they even imagined their bodies as mountains inhabited by immortals. The struggle to wu-wei was an effort to become immortal, to be born anew, to grow the embryo of immortality inside. A part of the disciplines of Daoism included imitation of the animals of nature, because they were thought to act without the intention and willfulness that characterized human decision making. Physical exercises included animal dances (wu qin xi) and movements designed to enable the unrestricted flow of the cosmic life force from which all things are made (qi). These movements designed to channel the flow of qi became associated with what came to be called tai qi or qi gong. Daoists practiced breathing exercises, used herbs and other pharmacological substances, and they employed an instruction booklet for sexual positions and intercourse, all designed to enhance the flow of qi energy. They even practiced external alchemy, using burners to modify the composition of cinnabar into mercury and made potions to drink and pills to ingest for the purpose of adding longevity. Many Daoist practitioners died as a result of these alchemical substances, and even a few Emperors who followed their instructions lost their lives as well, Qinshihuang being the most famous.

The attitude and practices necessary to the pursuit of immortality made this life all the more significant. Butcher Ding is a master butcher because his qi is in harmony with the dao. Daoist practices were meant for everyone, regardless of their origin, gender, social position, or wealth. However, Daoism was a complete philosophy of life and not an easy way to learn.

When superior persons learn the Dao, they practice it with zest.

When average persons learn of the Dao, they are indifferent.

When petty persons learn of the Dao, they laugh loudly.

If they did not laugh, it would not be worthy of being the Dao. (DDJ, 41)

8. Daoism and Confucianism

Arguably, Daoism shared some emphases with classical Confucianism such as a this-worldly concern for the concrete details of life rather than speculation about abstractions and ideals. Nevertheless, it largely represented an alternative and critical tradition divergent from that of Confucius and his followers. While many of these criticisms are subtle, some seem very clear.

One of the most fundamental teachings of the DDJ is that human discriminations, such as those made in law, morality (good, bad), and aesthetics (beauty, ugly), actually create the troubles and problems humans experience; they do not solve them (ch. 3). The clear implication is that the person following the dao must cease ordering his life according to human-made distinctions (ch. 19). Indeed, it is only when the dao recedes in its influence that these demarcations emerge (chs. 18, 38), because they are a form of disease (ch. 74). In contrast, Daoists believe that the dao is untangling the knots of life, blunting the sharp edges of relationships and problems, and turning down the light on painful occurrences (ch. 4). So, it is best to practice wu-wei in all endeavors, to act naturally and not willfully try to oppose or tamper with how reality is moving or try to control it by human discriminations.

Confucius and his followers wanted to change the world and be proactive in setting things straight. They wanted to tamper, orchestrate, plan, educate, develop, and propose solutions. Daoists, on the other hand, take their hands off of life when Confucians want their fingerprints on everything. Imagine this comparison. If the Daoist goal is to become like a piece of unhewn and natural wood, the goal of the Confucians is to become a carved sculpture. The Daoists put the piece before us just as it is found in its naturalness, and the Confucians polish it, shape it, and decorate it. This line of criticism is made very explicitly in the essay which makes up Zhuangzi Chs. 8-10.

Confucians think they can engineer reality, understand it, name it, control it. But the Daoists think that such endeavors are the source of our frustration and fragmentation (DDJ, chs. 57, 72). They believe the Confucians create a gulf between humans and nature that weakens and destroys us. Indeed, as far as the Daoists are concerned, the Confucian project is like a cancer that saps our very life. This is a fundamental difference in how these two great philosophical traditions think persons should approach life, and as shown above it is a consistent difference found also between the Zhuangzi and Confucianism.

The Yellow Emperor sections of the Zhuangzi in Chs. 12, 13, and 14 contain five text blocks in which Laozi is portrayed in dialogue with Confucius and pictured as Confucius’s master and teacher. These materials provide direct access into the Daoist criticism of the Confucian project.

9. Daoism in the Han

The teachings that were later called Daoism were closely associated with a stream of thought called Huanglao Dao (Yellow Emperor-Laozi Dao) in the 3rd and 2nd centuries B.C.E. The thought world transmitted in this stream is what Sima Tan meant by Daojia. The Huanglao school is best understood as a lineage of Daoist practitioners mostly residing in the state of Qi (modern Shandong area). Huangdi was the name for the Yellow Emperor, from whom the rulers of Qi said they were descended. When Emperor Wu, the sixth sovereign of the Han dynasty (r. 140-87 B.C.E.), elevated Confucianism to the status of the official state ideology and training in it became mandatory for all bureaucratic officials, the tension with Daoism became more evident. And yet, at court, people still sought longevity and looked to Daoist masters for the secrets necessary for achieving it. Wu continued to engage in many Daoist practices, including the use of alchemy, climbing sacred Taishan (Mt. Tai), and presenting talismanic petitions to heaven. Liu An (180-122 B.C.E.), the Prince of Huainan and a nephew of Wu, is associated with the production of the work called the Masters of Huainan (Huainanzi). This is a highly synthetic work formed at what is known as the Huainan academy and greatly influenced by Yellow Emperor Daoism. John Major and a team of translators published the first complete English version of this text (2010). The text was an attempt to merge cosmology, Confucian ideals, and a political theory using “quotes” attributed to the Yellow Emperor, although the statements actually parallel closely the Daodejing and the Zhuangzi. All this is of added significance because in the later Han work Laozi bianhua jing (Book of the Transformations of Laozi), the Chinese physics that persons and objects change forms was employed in order to identify Laozi with the Yellow Emperor.

10. Celestial Masters Daoism

Even though Emperor Wu forced Daoist practitioners from court, Daoist teachings found a fertile ground in which to grow in the environment of discontent with the policies of the Han rulers and bureaucrats. Popular uprisings sprouted. The Yellow Turban movement tried to overthrow Han imperial authority in the name of the Yellow Emperor and promised to establish the Way of Great Peace (Tai ping). Indeed, the basic moral and philosophical text that provided the intellectual justification of this movement was the Classic of Great Peace (Taiping jing), available in an English version by Barbara Hendrischke. The present version of this work in the Daoist canon is a later and altered iteration of the original text, which dates to about 166 C.E. and is attributed to transnormal revelations experienced by Zhang Jiao.

Easily the most important of the Daoist trends at the end of the Han period was the wudou mi dao (Way of Five Bushels of Rice) movement, best known as the Way of the Celestial Masters (tianshi dao). This movement is traceable to a Daoist hermit named Zhang Ling, also known as Zhang Daoling, who resided on a mountain near modern Chengdu in Sichuan. According to an account in Ge Hong’s Biographies of Spirit Immortals, Laozi appeared to Zhang (c. 142 C.E.) and gave him a commission to announce the imminent end of the world and the coming age of Great Peace (taiping). The revelation said that those who followed Zhang would become part of the Orthodox One Covenant with the Powers of the Universe (Zhengyi meng wei). Zhang began the movement that culminated in a Celestial Master state. The administrators of this state were called libationers (ji jiu), because they performed religious rites as well as political duties. They taught that personal illness and civil mishap were owing to the mismanagement of the forces of the body and nature. The libationers taught a strict form of morality and displayed registers of numinal powers they could access and control. Libationers were moral investigators, standing in for a greater celestial bureaucracy. The Celestial Master state developed against the background of the decline of the later Han dynasty. Indeed, when the empire finally decayed, the Celestial Master government was the only order in much of southern China.

When the Wei dynastic rulers became uncomfortable with the Celestial Masters’ power, they broke up the power centers of the movement. But this backfired, because it actually served to disperse Celestial Masters followers throughout China. Many of the refugees settled near Xi’an, in and around the site of Louguan tai. The movement remained strong because its leaders had assembled a canon of texts, the Statutory Texts of the One and Orthodox (Zhengyi fawen). This group of writings included philosophical, political, and ritual texts. It became a fundamental part of the later authorized Daoist canon.

11. Neo-Daoism

The resurgence of Daoism after the Han dynasty is often known as Neo-Daoism. In this period, Confucian scholars sought to annotate and reinterpret their own classical texts to move them toward greater compatibility with Daoism, and they even wrote commentaries on Daoist works. A new type of Confucianism known as the Way of Mysterious Learning (Xuanxue) emerged. It is represented by a set of scholars, including some of the most prominent thinkers of the period: Wang Bi (226-249), He Yan (d. 249), Xiang Xiu (223?-300), Guo Xiang (d. 312), and Pei Wei (267-300). In general, these scholars shared an effort to reinterpret the social and moral understanding of Confucianism in ways that make it more compatible with Daoist philosophy. Indeed, the extent of Daoist influence evident in the texts of these writers has led some scholars to call this movement ‘Neo-Daoism.’ Wang Bi and Guo Xiang, who wrote commentaries respectively on the Daodejing and the Zhuangzi, were the most important voices in this development. Traditionally, the famous “Seven Sages of the Bamboo Grove” (Zhulin qixian) have also been associated with the new Daoist way of life that expressed itself in culture and not merely in mountain retreats. These thinkers included landscape painters, calligraphers, poets, and musicians.

Among the philosophers of this period, the great representative of Daoism in southern China was Ge Hong (283-343 CE). He practiced not only philosophical reflection but also external alchemy, manipulating mineral substances such as mercury and cinnabar in an effort to gain immortality. His work the Inner Chapters of the Master Who Embraces Simplicity (Baopuzi neipian) is the most important Daoist philosophical work of this period. For him, longevity and immortality are not the same; the former is only the first step toward the latter.

12. Shangqing and Lingbao Daoist Movements

After the invasion of China by nomads from Central Asia, Daoists of the Celestial Master tradition who had been living in the north were forced to migrate into southern China, where Ge Hong’s version of Daoism was strong. The mixture of these two traditions is represented in the writings of the Xu family, an aristocratic group from what is today the city of Nanjing. Seeking Daoist philosophical wisdom and the long life it promised, many of them moved to Mao Shan (Mt. Mao), near the city. There they claimed to receive revelations from immortals, who dictated new wisdom and morality texts to them. Yang Xi was the most prominent medium and recipient of the Maoshan revelations (360-370 CE). These revelations came from spirits identified as the Mao brothers, local heroes who had been transformed into deities. Yang Xi’s writings formed the basis for Highest Purity (Shangqing) Daoism. The writings were extraordinarily well done, and even the calligraphy in which they were written was beautiful.

The philosophical importance of these texts lies in their idealization of the quest for immortality and their transposition of the material practices of Ge Hong’s alchemical science into a form of reflective meditation. In fact, the Shangqing school of Daoism is the beginning of the tradition known as “inner alchemy” (neidan), an individual mystical pursuit of wisdom.

Some thirty years after the Maoshan revelations, a descendant of Ge Hong named Ge Chaofu went into a mediumistic trance and authored a set of texts called the Numinous Treasure (Lingbao) teachings. These works were ritual recitation texts similar to Buddhist sutras, and indeed they borrowed heavily from Buddhism. At first, the Shangqing and Lingbao texts belonged to the general stream of the Celestial Masters and were not considered separate sects or movements within Daoism, although later lineages of masters emphasized the uniqueness of their teachings.

13. Tang Daoism

As the Lingbao texts illustrate, Daoism acted as a receiving structure for Buddhism. Many early translators of Buddhist texts used Daoist terms to render Indian ideas. Some Buddhists saw Laozi as an avatar of Shakyamuni (the Buddha), and some Daoists understood Shakyamuni as a manifestation of the dao, which also means he was a manifestation of Laozi. An often-made generalization is that Buddhism held north China in the 4th and 5th centuries, and Daoism the south. But gradually this balance reversed, and Daoism grew in scope and impact throughout China.

By the time of the Tang dynasty (618-906 CE), Daoism was the intellectual philosophy that underwrote the nation’s self-understanding. The imperial family claimed to descend from Li (by lore, the family of Laozi). Laozi was venerated by royal decree. Officials received Daoist initiation as Masters of its philosophy, rituals, and practices. A major center for Daoist studies was created at Dragon and Tiger Mountain (Longhu shan), chosen both for its feng shui and for its strategic location at the intersection of numerous southern China trade routes. The Celestial Masters who held leadership at Dragon and Tiger Mountain were later called “Daoist popes” by Christian missionaries because they had considerable political power.

In aesthetics, two great Daoist intellectuals worked during the Tang. Wu Daozi developed the rules for Daoist painting and Li Bai became its most famous poet. Interestingly, Daoist alchemists invented gunpowder during the Tang. The earliest block-print book on a scientific subject is a Daoist work entitled Xuanjie lu (850 CE). As Buddhism gradually grew stronger during the Tang, Daoist and Confucian intellectuals sought to initiate a conversation with it. The Buddhism that resulted was a reformed version known as Chan (Zen in Japan).

14. The Three Teachings

During the Five Dynasties (907-960 CE) and Song (960-1279 CE) periods, Confucianism enjoyed a resurgence, and Daoists found their place by teaching that principal thinkers of their tradition were Confucian scholars as well. Most notable among these was Lu Dongbin, a legendary Daoist immortal whom many believed to have been originally a Confucian teacher.

Daoism became a complete philosophy of life, reaching into religion, social action, and individual health and physical well-being. A huge network of Daoist temples known by the name Dongyue Miao (also called tianqing guan) was created throughout the empire, with a miao in virtually every town of any size. The Daoist masters who served these temples were often appointed as government officials. They also gave medical, moral, and philosophical advice, and led religious rituals, dedicated especially to the Lord of the Sacred Mountain of the East, Taishan. Daoist masters had wide authority. All this was obvious in the temple iconography: Taishan was represented as the emperor, the City God (cheng huang) was a high official, and the Earth God was portrayed as a prosperous peasant. Daoism of this period integrated the Three Teachings (sanjiao) of China: Confucianism, Buddhism, and Daoism. This process of synthesis continued throughout the Song and into the period of the Ming dynasty.

Such a wide dispersal of Daoist thought and practice, taken together with its interest in merging Confucianism and Buddhism, eventually created a fragmented ideology. Into this confusion came Wang Zhe (1113-1170 CE), the founder of Quanzhen (Complete Perfection) Daoism. It was Wang’s goal to bring the three teachings into a single great synthesis. For the first time, Daoist teachers adopted monastic forms of life, created monasteries, and organized themselves in ways they saw in Buddhism. This version of Daoist thought interpreted the classical texts of the DDJ and the Zhuangzi to call for a rejection of the body and material world. The Quanzhen order became powerful as the main partner of the Mongols (Yuan dynasty), who gave their patronage to its expansion. Less frequently, the Mongol emperors favored the Celestial Masters and their leader at Dragon and Tiger Mountain in an effort to undermine the power of the Quanzhen leaders. For example, the Zhengyi (Celestial Master) master of Beijing in the 1220s was Zhang Liusun. Under patronage he was allowed to build a Dongyue Miao in the city in 1223 and make it the unofficial town hall of the capital. But by the time of Khubilai Khan (r. 1260-1294) the Buddhists were used against all Daoists. The Khan ordered all Daoist books except the DDJ to be destroyed in 1281, and he closed the Quanzhen monastery in the city known as White Cloud Monastery (Baiyun Guan).

When the Ming dynasty (1368-1644) emerged, the Mongols were expelled and Chinese rule was restored. The emperors sponsored the creation of the first complete Daoist Canon (Daozang), which was edited between 1408 and 1445. This was an eclectic collection, including many Buddhist- and Confucian-related texts. Daoist influence reached its zenith.

15. The “Destruction” of Daoism

The Manchurian tribes that became rulers of China in 1644 and founded the Qing dynasty were already under the influence of conservative Confucian exiles. They stripped the Celestial Master of Dragon Tiger Mountain of his power at court. Only Quanzhen was tolerated. White Cloud Monastery (Baiyun Guan) was reopened, and a new lineage of thinkers was organized, calling itself the Dragon Gate lineage (Longmen pai). In the 1780s, Western traders arrived, and so did Christian missionaries. In 1849, the Hakka people of Guangxi province, among China’s poorest citizens, rose in revolt. They followed Hong Xiuquan, who claimed to be Jesus’ younger brother. This millennial movement, built on a strange version of Chinese Christianity, sought to establish the Heavenly Kingdom of Peace (taiping). As the Taiping swept throughout southern China, they destroyed Buddhist and Daoist temples and texts wherever they found them. The Taiping army completely razed the Daoist complexes on Dragon Tiger Mountain. During most of the 20th century, the drive to eradicate Daoist influence continued. In the 1920s, the “New Life” movement drafted students to go out on Sundays to destroy Daoist statues and texts. Accordingly, by the year 1926 only two copies of the Daoist Canon (Daozang) existed, and the Daoist philosophical heritage was in great jeopardy. But permission was granted to copy the canon kept at the White Cloud Monastery, and so the texts were preserved for the world. There are 1,120 titles in this collection in 5,305 volumes. Much of this material has yet to receive scholarly attention, and very little of it has been translated into any Western language.

The Cultural Revolution (1966-1976) attempted to complete the destruction of Daoism. Masters were killed or “re-educated.” Entire lineages were broken up and their texts were destroyed. The miaos were closed, burned, and turned into military barracks. At one time there were 300 Daoist sites in Beijing alone; now there are only a handful. However, Daoism is not dead. It survives as a vibrant philosophical system and way of life, as is evidenced by the revival of its practice and study in several new university institutes in the People’s Republic.

16. References and Further Reading

  • Ames, Roger and Hall, David. (2003). Daodejing: “Making This Life Significant” A Philosophical Translation. New York: Ballantine Books.
  • Ames, Roger. (1998). Wandering at Ease in the Zhuangzi. Albany: State University of New York Press.
  • Bokenkamp, Stephen R. (1997). Early Daoist Scriptures. Berkeley: University of California Press.
  • Boltz, Judith M. (1987). A Survey of Taoist Literature: Tenth to Seventeenth Centuries, China Research Monograph 32. Berkeley: University of California Press.
  • Chan, Alan. (1991). Two Visions of the Way: A Translation and Study of the Heshanggong and Wang Bi Commentaries on the Laozi. Albany: State University of New York Press.
  • Cook, Scott (2013). The Bamboo Texts of the Guodian: A Study & Complete Translation. New York: Cornell University East Asia Program.
  • Coutinho, Steve (2014). An Introduction to Daoist Philosophies.  New York: Columbia University Press.
  • Creel, Herrlee G. (1970). What is Taoism? Chicago: University of Chicago Press.
  • Csikszentmihalyi, Mark and Ivanhoe, Philip J., eds. (1999). Religious and Philosophical Aspects of the Laozi. Albany: State University of New York.
  • Girardot, Norman J. (1983). Myth and Meaning in Early Taoism: The Theme of Chaos (hun-tun). Berkeley: University of California Press.
  • Graham, Angus. (1981). Chuang tzu: The Inner Chapters. London: Allen & Unwin.
  • Graham, Angus. (1989). Disputers of the Tao: Philosophical Argument in Ancient China. La Salle, IL: Open Court.
  • Graham, Angus. (1979). “How much of the Chuang-tzu Did Chuang-tzu Write?” Journal of the American Academy of Religion, Vol. 47, No. 3.
  • Hansen, Chad (1992). A Daoist Theory of Chinese Thought. New York: Oxford University Press.
  • Hendrischke, Barbara (2015, reprint ed.). The Scripture on Great Peace: The Taiping jing and the Beginnings of Daoism. Berkeley: The University of California Press.
  • Henricks, Robert. (1989). Lao-Tzu: Te-Tao Ching. New York: Ballantine.
  • Hochsmann, Hyun and Yang Guorong, trans. (2007). Zhuangzi. New York: Pearson.
  • Ivanhoe, Philip J. (2002). The Daodejing of Laozi. New York: Seven Bridges Press.
  • Kjellberg, Paul and Ivanhoe, Philip J., eds. (1996) Essays on Skepticism, Relativism, and Ethics in the Zhuangzi. Albany: State University of New York.
  • Kleeman, Terry (1998). Great Perfection: Religion and Ethnicity in a Chinese Millennial Kingdom. Honolulu: University of Hawaii Press.
  • Kohn, Livia, ed. (2004). Daoism Handbook, 2 vols. Boston: Brill.
  • Kohn, Livia (2009). Introducing Daoism. London: Routledge.
  • Kohn, Livia (2014). Zhuangzi: Text and Context.  St. Petersburg: Three Pines Press.
  • Kohn, Livia and LaFargue, Michael., eds. (1998). Lao-tzu and the Tao-te-ching. Albany: State University of New York Press.
  • Kohn, Livia and Roth, Harold., eds. (2002). Daoist Identity: History, Lineage, and Ritual. Honolulu: University of Hawaii Press.
  • Komjathy, Louis (2014). Daoism: A Guide for the Perplexed. London: Bloomsbury.
  • LaFargue, Michael. (1992). The Tao of the Tao-te-ching. Albany: State University of New York Press.
  • Lau, D.C. (1982). Chinese Classics: Tao Te Ching. Hong Kong: Hong Kong University Press.
  • Lin, Paul J. (1977). A Translation of Lao-tzu’s Tao-te-ching and Wang Pi’s Commentary. Ann Arbor: University of Michigan.
  • Littlejohn, Ronnie (2010). Daoism: An Introduction. London: I.B. Tauris.
  • Littlejohn, Ronnie (2011). “The Liezi’s Use of the Lost Zhuangzi.” Riding the Wind with Liezi: New Perspectives on the Daoist Classic. Eds. Ronnie Littlejohn and Jeffrey Dippmann. Albany: State University of New York.
  • Lynn, Richard John. (1999). The Classic of the Way and Virtue: A New Translation of the Tao-Te Ching of Laozi as Interpreted by Wang Bi. New York: Columbia University Press.
  • Mair, Victor, ed. (2010). Experimental Essays on Zhuangzi. St. Petersburg: Three Pines Press. New edition of University of Hawai’i, 1983.
  • Mair, Victor. (1990). Tao Te Ching: The Classic Book of Integrity and the Way. New York: Bantam Press.
  • Mair, Victor (1994). Wandering on the Way: Early Taoist Tales and Parables of Chuang Tzu. Honolulu: University of Hawai’i Press.
  • Major, John, Queen, Sarah, Meyer, Andrew Seth, and Roth, Harold, trans. (2010). The Huainanzi: A Guide to the Theory and Practice of Government in Early Han China. New York: Columbia University Press.
  • Maspero, Henri. (1981). Taoism and Chinese Religion. Amherst: University of Massachusetts Press.
  • Miller, James (2003). Daoism: A Short Introduction.  Oxford: Oxford University Press.
  • Moeller, Hans-Georg (2004). Daoism Explained: From the Dream of the Butterfly to the Fishnet Allegory. Chicago: Open Court.
  • Robinet, Isabelle. (1997). Taoism: Growth of a Religion. Stanford: Stanford University Press.
  • Roth, Harold (1999). Original Tao: Inward Training (Nei-yeh) and the Foundations of Taoist Mysticism. New York: Columbia University Press.
  • Roth, Harold D. (1992). The Textual History of the Huai Nanzi. Ann Arbor: Association of Asian Studies.
  • Roth, Harold D. (1991). “Who Compiled the Chuang Tzu?” In Chinese Texts and Philosophical Contexts, ed. Henry Rosemont, 84-95. La Salle: Open Court.
  • Schipper, Kristofer. (1993). The Taoist Body. Berkeley: University of California Press.
  • Slingerland, Edward (2003). Effortless Action: Wu-Wei As Conceptual Metaphor and Spiritual Ideal in Early China. New York: Oxford University Press.
  • Waley, Arthur (1934). The Way and Its Power: A Study of the Tao Te Ching and its Place in Chinese Thought. London: Allen & Unwin.
  • Watson, Burton. (1968). The Complete Works of Chuang Tzu. New York: Columbia University Press.
  • Welch, Holmes. (1966). Taoism: The Parting of the Way. Boston: Beacon Press.
  • Welch, Holmes and Seidel, Anna, eds. (1979). Facets of Taoism. New Haven: Yale University Press.


Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Slavoj Žižek (1949—)

Slavoj Žižek is a Slovenian-born political philosopher and cultural critic. He was described by the British literary theorist Terry Eagleton as the “most formidably brilliant” recent theorist to have emerged from Continental Europe.

Žižek’s work is infamously idiosyncratic. It features striking dialectical reversals of received common sense; a ubiquitous sense of humor; a patented disrespect towards the modern distinction between high and low culture; and the examination of examples taken from the most diverse cultural and political fields. Yet Žižek’s work, as he warns us, has a very serious philosophical content and intention. He challenges many of the founding assumptions of today’s left-liberal academy, including the elevation of difference or otherness to ends in themselves, the reading of the Western Enlightenment as implicitly totalitarian, and the pervasive skepticism towards any context-transcendent notions of truth or the good.

One feature of Žižek’s work is its singular philosophical and political reconsideration of German idealism (Kant, Schelling and Hegel). Žižek has also reinvigorated the challenging psychoanalytic theory of Jacques Lacan, controversially reading him as a thinker who carries forward founding modernist commitments to the Cartesian subject and the liberating potential of self-reflective agency, if not self-transparency. Žižek’s works since 1997 have become more and more explicitly political, contesting the widespread consensus that we live in a post-ideological or post-political world, and defending the possibility of lasting changes to the new world order of globalization, the end of history, or the war on terror.

This article explains Žižek’s philosophy as a systematic, if unusually presented, whole; and it clarifies the technical language Žižek uses, which he takes from Lacanian psychoanalysis, Marxism, and German idealism. In line with how Žižek presents his own work, this article starts by examining Žižek’s descriptive political philosophy. It then examines the Lacanian-Hegelian ontology that underlies Žižek’s political philosophy. The final part addresses Žižek’s practical philosophy, and the ethical philosophy he draws from this ontology.

Table of Contents

  1. Biography
  2. Žižek’s Political Philosophy
    1. Criticism of Ideology as “False Consciousness”
    2. Ideological Cynicism and Belief
    3. Jouissance as Political Factor
    4. The Reflective Logic of Ideological Judgments (or How the King is King)
    5. Sublime Objects of Ideology
  3. Žižek’s Fundamental Ontology
    1. The Fundamental Fantasy & the Split Law
    2. Excursus: Žižek’s Typology of Ideological Regimes
    3. Kettle Logic, or Desire and Theodicy
    4. Fantasy as the Fantasy of Origins
    5. Exemplification: the Fall and Radical Evil (Žižek’s Critique of Kant)
  4. From Ontology to Ethics—Žižek’s Reclaiming of the Subject
    1. Žižek’s Subject, Fantasy, and the Objet Petit a
    2. The Objet Petit a & the Virtuality of Reality
    3. Forced Choice & Ideological Tautologies
    4. The Substance is Subject, the Other Does Not Exist
    5. The Ethical Act Traversing the Fantasy
  5. Conclusion
  6. References and Further Reading
    1. Primary Literature (Books by Žižek)
    2. Secondary Literature (Texts on Žižek)

1. Biography

Slavoj Žižek was born in 1949 in Ljubljana, Slovenia. He grew up in the comparative cultural freedom of the former Yugoslavia’s self-managing socialism. Here—significantly for his work—Žižek was exposed to the films, popular culture and theory of the noncommunist West. Žižek completed his PhD at Ljubljana in 1981 on German Idealism, and between 1981 and 1985 studied in Paris under Jacques-Alain Miller, Lacan’s son-in-law. In this period, Žižek wrote a second dissertation, a Lacanian reading of Hegel, Marx and Kripke. In the late 1980s, Žižek returned to Slovenia, where he wrote newspaper columns for the Slovenian weekly Mladina and cofounded the Slovenian Liberal Democratic Party. In 1990, he ran for a seat on the four-member collective Slovenian presidency, narrowly missing office. Žižek’s first published book in English, The Sublime Object of Ideology, appeared in 1989. Since then, Žižek has published over a dozen books, edited several collections, published numerous philosophical and political articles, and maintained a tireless speaking schedule. His earlier works are of the type “introductions to Lacan through popular culture / Hitchcock / Hollywood …” Since at least 1997, however, Žižek’s work has taken on an increasingly engaged political tenor, culminating in books on September 11 and the Iraq war. As well as being visiting professor at the Department of Psychoanalysis, Université Paris VIII in 1982-3 and 1985-6, Žižek has lectured at the Cardozo Law School, Columbia, Princeton, the New School for Social Research, the University of Michigan, Ann Arbor, and Georgetown. He is currently a returning faculty member of the European Graduate School, and founder and president of the Society for Theoretical Psychoanalysis, Ljubljana.

2. Žižek’s Political Philosophy

a. Criticism of Ideology as “False Consciousness”

In a way that is oddly reminiscent of Nietzsche, Žižek generally presents his work in a polemical fashion, knowingly striking out against the grain of accepted opinion. One untimely feature of Žižek’s work is his continuing defense and use of the unfashionable term “ideology.” According to the classical Marxist definition, ideologies are discourses that promote false ideas (or “false consciousness”) in subjects about the political regimes they live in. Nevertheless, because these ideas are believed by the subjects to be true, they assist in the reproduction of the existing status quo, in an exact instance of what Umberto Eco dubs “the force of the fake.” To critique ideology, according to this position, it is sufficient to unearth the truth(s) the ideologies conceal from the subject’s knowledge. Then, so the theory runs, subjects will become aware of the political shortcomings of their current regimes, and be able and moved to better them. As Žižek takes up in his earlier works, this classical Marxian notion of ideology has come under theoretical attack in a number of ways. First, to criticize a discourse as ideological implies access to a Truth about political things: the very Truth that the ideologies, as false, would conceal. But it has been widely disputed in the humanities that there could ever be any One such theoretically accessible Truth. Secondly, the notion of ideology is held to be irrelevant to describing contemporary sociopolitical life, because of the increased importance of what Jürgen Habermas calls “media-steered subsystems” (the market, public and private bureaucracies), and also because of the widespread cynicism of today’s subjects towards political authorities. For ideologies to have political importance, critics comment, subjects would have to have a level of faith in public institutions, ideals and politicians which today’s liberal-cosmopolitan subjects lack. The widespread notoriety of left-leaning authors like Michael Moore or Noam Chomsky, as one example, bears witness to how subjects today can know very well what Moore claims is the “awful truth,” and yet act as if they did not know.

Žižek agrees with these criticisms of the “false consciousness” model of ideology. Yet he insists that we are not living in a post-ideological world, as figures as different as Tony Blair, Daniel Bell or Richard Rorty have claimed. Žižek proposes instead that in order to understand today’s politics we need a different notion of ideology. In a typically bold reversal, Žižek’s position is that today’s widespread consensus that our world is post-ideological gives voice to what he calls the “archideological” fantasy. Because “ideology” has carried a pejorative sense since Marx, no one taken in by such an ideology has ever believed that they were so duped, Žižek comments. If the term “ideology” has any meaning at all, ideological positions are always what people impute to Others (for today’s left, for example, the political right are the dupes of one or another noble lie about natural community; for the right, the left are the dupes of a well-meaning but utopian egalitarianism bound to lead to economic and moral collapse, and so forth). For subjects to believe in an ideology, it must have been presented to them, and been accepted, as non-ideological: indeed, as True and Right, and what anyone sensible would believe. As we shall see in 2e, Žižek is alert to the realist insight that there is no more effective political gesture than to declare some contestable matter above political contestation. Just as the third way is said to be post-ideological or national security is claimed to be extra-political, so Žižek argues that ideologies are always presented by their proponents as being discourses about Things too sacred to profane by politics. Hence, Žižek’s bold opening in The Sublime Object of Ideology is to claim that today ideology has not so much disappeared from the political landscape as come into its own. It is exactly because of this success, Žižek argues, that ideology has also been able to be dismissed in accepted political and theoretical opinion.

b. Ideological Cynicism and Belief

Today’s typical first world subjects, according to Žižek, are the dupes of what he calls “ideological cynicism.” Drawing on the German political theorist Peter Sloterdijk, Žižek contends that the formula describing the operation of ideology today is not “they do not know it, but they are doing it,” as it was for Marx. It is “they know it, but they are doing it anyway.” If this looks like nonsense from the classical Marxist perspective, Žižek’s position is that this cynicism nevertheless indicates the deeper efficacy of political ideology per se. Ideologies, as political discourses, are there to secure the voluntary consent (or what La Boétie called servitude volontaire) of people concerning contestable political policies or arrangements. Yet, Žižek argues, subjects will only voluntarily agree to follow one or other such arrangement if they believe that, in doing so, they are expressing their free subjectivity, and might have done otherwise.

However false such a sense of freedom is, Žižek insists that it is nevertheless a political instance of what Hegel called an essential appearance. Althusser’s understanding of ideological identification suggests that an individual is wholly “interpellated” into a place within a political system by the system’s dominant ideology and ideological state apparatuses. Contesting this notion by drawing on Lacanian psychoanalysis, however, Žižek argues that it is a mistake to think that, for a political position to win peoples’ support, it needs to effectively brainwash them into thoughtless automatons. Rather, Žižek maintains that any successful political ideology always allows subjects to have and to cherish a conscious distance towards its explicit ideals and prescriptions—or what he calls, in a further technical term, “ideological disidentification.”

Again bringing the psychoanalytic theory of Lacan to bear in political theory, Žižek argues that the attitude of subjects towards authority revealed by today’s ideological cynicism resembles the fetishist’s attitude towards his fetish. The fetishist’s attitude towards his fetish has the peculiar form of a disavowal: “I know well that (for example) the shoe is only a shoe, but nevertheless, I still need my partner to wear the shoe in order to enjoy.” According to Žižek, the attitude of political subjects towards political authority evinces the same logical form: “I know well that (for example) Bob Hawke / Bill Clinton / the Party / the market does not always act justly, but I still act as though I did not know that this is the case.” In his famous essay “Ideology and Ideological State Apparatuses,” Althusser staged a kind of primal scene of ideology: the moment when a policeman (as bearer of authority) says “hey you!” to an individual, and the individual recognizes himself as the addressee of this call. In the “180 degree turn” of the individual towards this Other who has addressed him, the individual becomes a political subject, Althusser says. Žižek’s central technical notion of the “big Other” [grand Autre] closely resembles (to the extent that it is not modelled on) Althusser’s notion of the Subject (capital “S”) in whose name public authorities (like the police) can legitimately call subjects to account within a regime—for example, “God” in a theocracy, “the Party” under Stalinism, or “the People” in today’s China. As the central chapter of The Sublime Object of Ideology specifies, ideologies for Žižek work to identify individuals with such important or rallying political terms as these, which Žižek calls “master signifiers.” The strange but decisive thing about these pivotal political words, according to Žižek, is that no one knows exactly what they mean or refer to, or has ever seen with their own eyes the sacred objects which they seem to name (for example: God, the Nation, or the People). This is one reason why Žižek, in the technical language he inherits (via Lacan) from structuralism, says that the most important words in any political doctrine are “signifiers without a signified” (that is, words that do not refer to any clear and distinct concept or demonstrable object).

This claim of Žižek’s is connected to two other central ideas in his work:

  • First: Žižek adapts the psychoanalytic notion that individuals are always “split” subjects, divided between the levels of their conscious awareness and the unconscious. Žižek contends throughout his work that subjects are always divided between what they consciously know and can say about political things, and a set of more or less unconscious beliefs they hold concerning individuals in authority, and the regime in which they live (see 3a). Even if people cannot say clearly and distinctly why they support some political leader or policy, for Žižek no less than for Edmund Burke, this fact is not politically decisive, as we will see in 2e below.
  • Second: Žižek makes a crucial distinction between knowledge and belief. Exactly where and because subjects do not know, for example, what “the essence” of “their people” is, the scope and nature of their beliefs on such matters is politically decisive, according to Žižek (again, see 2e below).

Žižek’s understanding of political belief is modelled on Lacan’s understanding of transference in psychoanalysis. The belief or “supposition” of the analysand in psychoanalysis is that the Other (his analyst) knows the meaning of his symptoms. This is obviously a false belief at the start of the analytic process. But it is only through holding this false belief about the analyst that the work of analysis can proceed, and the transferential belief can become true (when the analyst does become able to interpret the symptoms). Žižek argues that this strange intersubjective or dialectical logic of belief in clinical psychoanalysis is also what characterizes people’s political beliefs. Belief is always “belief through the Other,” Žižek argues. If subjects do not know the exact meaning of those “master signifiers” with which they politically identify, this is because their political belief is mediated through their identifications with others. Although they each themselves “do not know what they do” (which is the title of one of Žižek’s books [Žižek, 2002]), the deepest level of their belief is maintained through the belief that nevertheless there are Others who do know. A number of features of political life are cast into new relief given this psychoanalytic understanding, Žižek claims:

  • First, Žižek contends that the key political function of holders of public office is to occupy the place of what he calls, after Lacan, “the Other supposed to know.” Žižek cites the example of priests reciting mass in Latin before an uncomprehending laity, who believe that the priests know the meaning of the words, and for whom this is sufficient to keep the faith. Far from presenting an exception to the way political authority works, for Žižek this scenario reveals the universal rule of how political consensus is formed.
  • Second, and in connection with this, Žižek contends that political power is primarily “symbolic” in its nature. What he means by this further technical term is that the roles, masks, or mandates that public authorities bear are more important politically than the true “reality” of the individuals in question (whether they are unintelligent, unfaithful to their wives, good family men, and so forth). According to Žižek, for example, fashionable liberal criticisms of George W. Bush the man are irrelevant to understanding or evaluating his political power. It is the office or place an individual occupies in their political system (or “big Other”) that ensures the political force of their words, and the belief of subjects in their authority. This is why Žižek maintains that the resort of a political leader or regime to “the real of violence” (such as war or police action) amounts to a confession of its weakness as a political regime. Žižek sometimes puts this thought by saying that people believe through the big Other, or that the big Other believes for them, despite what they might inwardly think or cynically say.

c. Jouissance as Political Factor

A further key point that Žižek takes from Louis Althusser’s later work on ideology is Althusser’s emphasis on the “materiality” of ideology, its embodiment in institutions and peoples’ everyday practices and lives. Žižek’s realist position is that all the ideas in the world can have no lasting political effect unless they come to inform institutions and subjects’ day-to-day lives. In The Sublime Object of Ideology, Žižek cites Blaise Pascal’s advice that doubting subjects should get down on their knees and pray, and then they will believe. Pascal’s position is not any kind of simple proto-behaviorism, according to Žižek. The deeper message of Pascal’s directive, he asserts, is to suggest that once subjects have come to believe through praying, they will also retrospectively see that they got down on their knees because they always believed, without knowing it. In this way, in fact, Žižek can be read as a consistent critic not only of the importance of knowledge in the formation of political consensus, but also of the importance of “inwardness” in politics per se in the tradition of the younger Carl Schmitt.

Prior political philosophy has placed too little emphasis, Žižek asserts, on communities’ cultural practices that involve what he calls “inherent transgression.” These are practices sanctioned by a culture that nevertheless allow subjects some experience of what is usually exceptional to or prohibited in their everyday lives as civilized political subjects—things like sex, death, defecation, or violence. Such experiences involve what Žižek calls jouissance, another technical term he takes from Lacanian psychoanalysis. Jouissance is usually translated from the French as “enjoyment.” As opposed to what we talk of in English as “pleasure”, though, jouissance is an always sexualized, always transgressive enjoyment, at the limits of what subjects can experience or talk about in public. Žižek argues that subjects’ experiences of the events and practices wherein their political culture organizes its specific relations to jouissance (in first world nations, for example, specific sports, types of alcohol or drugs, music, festivals, films) are as close as they will get to knowing the deeper Truth intimated for them by their regime’s master signifiers: “nation”, “God”, “our way of life,” and so forth (see 2b above). Žižek, like Burke, argues that it is such ostensibly nonpolitical and culturally specific practices as these that irreplaceably single out any political community from its others and enemies. Or, as one of Žižek’s chapter titles in Tarrying With the Negative puts it, where and although subjects do not know their Nation, they “enjoy (jouis) their nation as themselves.”

d. The Reflective Logic of Ideological Judgments (or How the King is King)

According to Žižek, like and after Althusser, ideologies are thus political discourses whose primary function is not to make correct theoretical statements about political reality (as Marx’s “false consciousness” model implies), but to orient subjects’ lived relations to and within this reality. If a political ideology’s descriptive propositions turn out to be true (for example: “capitalism exploits the workers,” “Saddam was a dictator,” “the Spanish are the national enemy,” and so forth), this does not in any way reduce their ideological character, in Žižek’s estimation. This is because this character concerns the political issue of how subjects’ belief in these propositions, instead of those of opponents, positions subjects on the leading political issues of the day. For Žižek, political speech is primarily about securing a lived sense of unity or community between subjects, something like what Kant called sensus communis or Rousseau the general will. If political propositions seemingly do describe things in the world, Žižek’s position is that we nevertheless need always to understand them as Marx understood the exchange value of commodities—as “a relation between people being concealed behind a relation between things.” Or again: just as Kant thought that the proposition “this is beautiful” really expresses a subject’s reflective sense of commonality with all other subjects capable of being similarly affected by the object, so Žižek argues that propositions like “Go Spain!” or “the King will never stop working to secure our future” are what Kant called reflective judgments, which tell us as much or more about the subject’s lived relation to political reality as about this reality itself.

If ideological statements are thus performative utterances that produce political effects by their being stated, Žižek in fact holds that they are a strange species of performative utterance overlooked by speech act theory. Even though, when subjects say “the Queen is the Queen!”, they are at one level reaffirming their allegiance to a political regime, Žižek holds that this does not mean that this regime could survive without appearing to rest on such deeper Truths about the way the world is. As we saw in 2b, Žižek maintains that political ideologies always present themselves as naming such deeper, extra-political Truths. Ideological judgments, according to Žižek, are thus performative utterances which, in order to perform their salutary political work, must yet appear to be objective descriptions of the way the world is (exactly as when a chairman says “this meeting is closed!” and only thereby brings this state of affairs into effect). In The Sublime Object of Ideology, Žižek cites Marx’s analysis of being a King in Das Kapital to illustrate his meaning. A King is only King because his subjects loyally think and act as if he is King (think of the tragedy of Lear). Yet, at the same time, the people will only believe he is King if they believe that this is a deeper Truth about which they can do nothing.

e. Sublime Objects of Ideology

In line with Žižek’s ideas of “ideological disidentification” and “jouissance as a political factor” (see 2b and 2c above) and in a clear comparison with Derrida’s deconstruction, arguably the unifying thought in Žižek’s political philosophy is that regimes can only secure a sense of collective identity if their governing ideologies afford subjects an understanding of how their regime relates to what exceeds, supplements or challenges its identity. This is why Kant’s analytic of the sublime in The Critique of Judgment, as an analysis of an experience in which the subject’s identity is challenged, is of the highest theoretical interest for Žižek. Kant’s analytic of the sublime isolates two moments to its experience, as Žižek observes. In the first moment, the size or force of an object painfully impresses upon the subject the limitation of its perceptual capabilities. In a second moment, however, a “representation” arises where “we would least expect it,” which takes as its object the subject’s own failure to perceptually take the object in. This representation resignifies the subject’s perceptual failure as indirect testimony about the inadequacy of human perception as such to attain to what Kant calls Ideas of Reason (in Kant’s system, God, the Universe as a Whole, Freedom, the Good).

According to Žižek, all successful political ideologies necessarily refer to and turn around sublime objects posited by political ideologies. These sublime objects are what political subjects take it that their regime’s ideologies’ central words mean or name: extraordinary Things like God, the Führer, or the King, in whose name they will (if necessary) transgress ordinary moral laws and lay down their lives. When a subject believes in a political ideology, as we saw in 2b above, Žižek argues that this does not mean that they know the Truth about the objects which its key terms seemingly name—indeed, Žižek will finally contest that such a Truth exists (see 3c, d). Nevertheless, by drawing on a parallel with Kant on the sublime, Žižek makes a further and more radical point. Just as in the experience of the sublime, Kant’s subject resignifies its failure to grasp the sublime object as indirect testimony to a wholly “supersensible” faculty within herself (Reason), so Žižek argues that the inability of subjects to explain the nature of what they believe in politically does not indicate any disloyalty or abnormality. What political ideologies do, precisely, is provide subjects with a way of seeing the world according to which such an inability can appear as testimony to how Transcendent or Great their Nation, God, Freedom, and so forth is—surely far above the ordinary or profane things of the world. In Žižek’s Lacanian terms, these things are Real (capital “R”) Things (capital “T”), precisely insofar as they in this way stand out from the reality of ordinary things and events.

Žižek hence agrees with Ernesto Laclau and Chantal Mouffe that, in the struggle between competing political ideologies, the aim of each is to elevate its particular political perspective (about what is just, best, and so forth) to the point where it can lay claim to name, give voice to or to represent the political whole (for example: the nation). In order to achieve this political feat, Žižek argues, each group must succeed in identifying its perspective with the extra-political, sublime objects accepted within the culture as giving body to this whole (for example: “the national interest,” “the dictatorship of the proletariat”). Or else, it must supplant the previous ideologies’ sublime objects with new such objects. In the absolute monarchies, as Ernst Kantorowicz argued, the King’s so-called “second” or “symbolic” body exemplified paradigmatically such sublime political objects as the unquestionable font of political authority (the particular individual who was King was contestable, but not the sovereign’s role itself). Žižek’s critique of Stalinism, in a comparable way, turns upon the thought that “the Party” had this sublime political status in Stalinist ideology. Class struggle in this society did not end, Žižek contends, despite Stalinist propaganda. It was only displaced from a struggle between two classes (for example, bourgeois versus proletarian) to one between “the Party” as representative of the people or the whole and all who disagreed with it, ideologically positioned as “traitors” or “enemies of the people.”

3. Žižek’s Fundamental Ontology

a. The Fundamental Fantasy & the Split Law

For Žižek, as we have seen, no political regime can sustain the political consensus upon which it depends unless its predominant ideology affords subjects a sense both of individual distance or freedom with regard to its explicit prescriptions (2b), and that the regime is grounded in some larger or “sublime” Truth (2e). Žižek’s political philosophy identifies interconnected instances of these dialectical ideas: his notion of “ideological disidentification” (2b); his contention that ideologies must accommodate subjects’ transgressive experiences of jouissance (2c); and his conception of exceptional or sublime objects of ideology (2e). These ideas all intersect with what is arguably the central notion in Žižek’s political philosophy: his notion of “ideological fantasy.” “Ideological fantasy” is Žižek’s technical name for the deepest framework of belief that structures how political subjects, and/or a political community, come to terms with what exceeds their norms and boundaries, in the various registers we examined above.

Like many of Žižek’s key notions, Žižek’s notion of the ideological fantasy is a political adaptation of an idea from Lacanian psychoanalysis: specifically, Lacan’s structuralist rereading of Freud’s psychoanalytic understanding of unconscious fantasy. As for Lacan, so for Žižek, the civilizing of subjects necessitates their founding sacrifice (or “castration”) of jouissance, enacted in the name of sociopolitical Law. Subjects, to the extent that they are civilized, are “cut” from the primal object of their desire. Instead, they are forced by social Law to pursue this special, lost Thing (in Žižek’s technical term, the “objet petit a”; see 4a, 4b) by observing their societies’ linguistically mediated conventions, deferring satisfaction, and accepting sexual and generational difference. Subjects’ “fundamental fantasies,” according to Lacan, are unconscious structures which allow them to accept the traumatic loss involved in this founding sacrifice. They turn around a narrative about the lost object, and how it was lost (see 3d). In particular, the fundamental fantasy of a subject resignifies the founding repression of jouissance by Law—which, according to Lacan, is necessary if the individual is to become a speaking subject—as if it were a merely contingent, avoidable occurrence. In the fantasy, that is, what is for Žižek a constitutive event for the subject is renarrated as the historical action of some exceptional individual (in Enjoy Your Symptom!, the pre-Oedipal “anal father”). Equally, the jouissance the subject considers itself to have lost is posited by the fantasy as having been taken from it by this persecutory “Other supposed to enjoy” (see 3b).

In the notion of ideological fantasy, Žižek takes this psychoanalytic framework and applies it to the understanding of the constitution of political groups. If, after Plato, political theory concerns the Laws of a regime, the Laws for Žižek are always split or double in kind. Each political regime has a body of more or less explicit, usually written Laws which demand that subjects forego jouissance in the name of the greater good, and according to the letter of its proscriptions (for example, the US or French constitutions). Žižek identifies this level of the Law with the Freudian ego ideal. But Žižek argues that, in order to be effective, a regime’s explicit Laws must also harbor and conceal a darker underside, a set of more or less unspoken rules which, far from simply repressing jouissance, implicate subjects in a guilty enjoyment in repression itself, which Žižek likens to the “pleasure-in-pain” associated with the experience of Kant’s sublime (see 2d). The Freudian superego, for Žižek, names the psychical agency of the Law, as it is misrepresented and sustained by subjects’ fantasmatic imaginings of a persecutory Other supposed to enjoy (like the archetypal villain in noir films). This darker underside of the Law, Žižek agrees with Lacan, is at its base a constant imperative to subjects to jouis!, by engaging in the “inherent transgressions” of their sociopolitical community (see 2b).

Žižek’s notion of the split in the Law in this way intersects directly with his notion of ideological disidentification examined in 2b. While political subjects maintain a conscious sense of freedom from the explicit norms of their culture, Žižek contends, this disidentification is grounded in their unconscious attachment to the Law as superego, itself an agency of enjoyment. If Althusser famously denied the importance of what people “have on their consciences” in the explanation of how political ideologies work, then for Žižek the role of guilt—as the way in which the subject enjoys his subjection to the laws—is vital to understanding subjects’ political commitments. Individuals will only turn around when the Law hails them, Žižek argues, insofar as they are finally subjects also of the unconscious belief that the “big Other” has access to the jouissance they have lost as subjects of the Law, and which they can accordingly reattain through their political allegiance (see 2b). It is this belief, what could be termed this “political economy of jouissance,” that the fundamental fantasies underlying political regimes’ worldviews are there to structure in subjects.

b. Excursus: Žižek’s Typology of Ideological Regimes

With these terms of Žižek’s Lacanian ontology in place, it becomes possible to lay out Žižek’s theoretical understanding of the differences between different types of ideological-political regimes. Žižek’s works maintain a lasting distinction between modern and premodern political regimes, which he contends are grounded in fundamentally different ways of organizing subjects’ relations to Law and jouissance (3a). In Žižek’s Lacanian terms, premodern ideological regimes exemplified what Lacan calls in Seminar XVII the discourse of the master. In these authoritarian regimes, the word and will of the King or master (in Žižek’s mathemes, S1) was sovereign—the source of political authority, with no questions asked. Her/His subjects, in turn, are supposed to know (S2) the edicts of the sovereign and the Law (as the classical legal notion has it, “ignorance is no excuse”). In this arrangement, while jouissance and fantasy are political factors, as Žižek argues, regimes’ quasi-transgressive practices remain exceptional to the political arena, glimpsed only in such carnivalesque events as festivals or the types of public punishment Michel Foucault (for example) describes in the introduction to Discipline and Punish.

Žižek agrees with both Foucault and Marx that modern political regimes exert a form of power that is both less visible and more far-reaching than that of the regimes they replaced. Modern regimes, both liberal capitalist and totalitarian, for Žižek, are no longer predominantly characterized by the Lacanian discourse of the master. Given that the Oedipus complex is associated by him with this older type of political authority, Žižek agrees with the Frankfurt School theorists that, contra Deleuze and Guattari, today’s subjectivity as such is already post- or anti-Oedipal. Indeed, in The Plague of Fantasies and The Ticklish Subject, Žižek contends that the characteristic discontents of today’s political world—from religious fundamentalism to the resurgence of racism in the first world—are not archaic remnants of, or protests against, traditional authoritarian structures, but the pathological effects of new forms of social organization. For Žižek, the defining agency in modern political regimes is knowledge (or, in his Lacanian mathemes, S2). The Enlightenment represented the unprecedented political venture to replace belief in authority as the basis of polity with human reason and knowledge. As Schmitt also complained, the legitimacy of modern authorities is grounded not in the self-grounding decision of the sovereign. It is grounded in the ability of authorities to muster coherent chains of reasons to subjects about why they are fit to govern. Modern regimes hence always claim to speak not out of ignorance of what subjects deeply enjoy (“I don’t care what you want; just do what I say!”) but in the very name of subjects’ freedom and enjoyment.

Whether fascist or communist, Žižek argues in his early books, totalitarian (as opposed to authoritarian) regimes justified their rule by final reference to quasi-scientific metanarratives. These metanarratives—a narrative concerning racial struggle in Nazism, or the Laws of History in Stalinism—each claimed to know the deeper Truth about what subjects want, and accordingly could both justify the most striking transgressions of ordinary morality and justify these transgressions by reference to subjects’ jouissance. The most disturbing or perverse features of these regimes can only be explained by reference to the key place knowledge held in them. Žižek describes, for instance, the truly Catch-22-like logic of the Soviet show trials, wherein it was not enough for subjects to be condemned by the authorities as enemies; they were also made to avow their “objective” error in opposing the party as agent of the laws of history.

Žižek’s statements on today’s liberal capitalism are complex, if not in mutual tension. At times, Žižek tries to formalize the economic generation of surplus value as a meaningfully “hysterical” social arrangement. Yet Žižek predominantly argues that the market-driven consumerism of late capitalist subjects is characterized by a marketing discourse which—like totalitarian ideologies—does not appeal to subjects in the name of any collective cause justifying individuals’ sacrifice of jouissance. Instead, as social conservatives complain, it musters the quasi-scientific discourses of marketing and public relations, or (increasingly) Eastern religion, in order to recommend products to subjects as necessary means in the liberal pursuit of happiness and self-fulfillment. In line with this change, Žižek contends in The Ticklish Subject that the paradigmatic type of leader today is not some inaccessible boss but the uncannily familiar figure of Bill Gates—more like a little brother than the traditional father or master. Again: for Žižek it is deeply telling that, at the same time as the nuclear family is being eroded in the first world, other institutions, from the so-called “nanny” welfare state to private corporations, are increasingly becoming “familiarized” (with self-help sessions for employees, company days, casual days, and so forth).

c. Kettle Logic, or Desire and Theodicy

We saw how Žižek claims that the truth of political ideologies concerns what they do, not what they say (2d). At the level of what political ideologies say, Žižek maintains, a Lacanian critical theory holds that ideologies must be finally inconsistent. Freud famously talked of the example of a man who returns a borrowed kettle to its owner broken. The man adduces mutually inconsistent excuses which are united only in terms of his ignoble desire to evade responsibility for breaking the kettle: he never borrowed the kettle, the kettle was already broken when he borrowed it, and when he gave the kettle back it was not really broken anyway. As Žižek reads political ideologies, they function in the same way in the political field—this is the sense of the subtitle of his 2004 Iraq: The Borrowed Kettle. As we saw in 2d, Žižek maintains that the end of political ideologies is to secure and defend the idea of the polity as a wholly unified community. When political strife, uncertainty or division occurs, political ideologies and the fundamental fantasies upon which they lean (3a) operate to resignify this political discontent so that the political ideal of community can be sustained, and to deny the possibility that this discontent might signal a fundamental injustice or flaw within the regime. In what amounts to a kind of political theodicy, Žižek’s work points to a number of logically inconsistent ideological responses to political discontents, which are united only by the desire that informs them, like Freud’s “kettle logic”:

  1. Saying that these divisions are politically unimportant, transient or merely apparent.
    Or, if this explanation fails:
  2. Saying that the political divisions are in any case contingent to the ordinary run of events, so that if their cause is removed or destroyed, things will return to normal.
    Or, more perilously:
  3. Saying that the divisions or problems are deserved by the people for the sake of the greater good (in Australia in the 90s, for example, we experienced “the recession we had to have”), or as punishment for their betrayal of the national Thing.

Žižek’s view of the political functioning of sublime objects of ideology (see 2e) can be charted exactly in terms of this political theodicy. We saw in 3a how Žižek argues that subjects’ fantasy is what allows them to come to terms with the loss of jouissance fundamental to being social or political animals. Žižek centrally maintains that such narrative attempts at political self-understanding—whether of individuals or political regimes—are ultimately unable to achieve these ends, except at the price of telling inconsistencies.

As Žižek highlights in his analyses of the political discontents in former Yugoslavia following the fall of communism, each national or political community tends to claim that its sublime Thing is inalienable, and hence utterly incapable of being understood or destroyed by enemies. Nevertheless, the invariable correlative of this emphasis on the inalienable nature of one’s Thing, Žižek argues in Tarrying with the Negative (1993), is the notion that It is simultaneously deeply fragile if not under active threat. For Žižek, this mutual inconsistency is only theoretically resolvable if, despite first appearances, we posit a materialist teaching that says that the “substance” seemingly named by political regimes’ key rallying terms (see 2e) is only sustained in their lived communal practices (as we say when someone does not get a joke, “you had to be there”). Yet political ideologies, as such, cannot avow this possibility (see 2d). Instead, ideological fantasies posit various exemplars of a persecutory enemy or, as Žižek says, “the Other of the Other” to whom the explanation of political disunity or discontent can be traced. If only this other or enemy could be removed, the political fantasy contends, the regime would be fully equitable and just. Historical examples of such figures of the enemy include “the Jew” in Nazi ideology, or the “petty bourgeois” in Stalinism.

Again: a type of “kettle logic” applies to the way these enemies are represented in political ideologies, according to Žižek. “The Jew” in Nazi ideology, for example, was an inconsistent condensation of features of both the ruling capitalist class (money grabbing, exploitation of the poor) and of the proletariat (dirtiness, sexual promiscuity, communism). The only consistency this figure has, that is, is precisely as a condensation of everything that Nazi ideology’s Aryan Volksgemeinschaft (roughly, “national community”) was constructed in response and political opposition to.

d. Fantasy as the Fantasy of Origins

In a way that has led some critics (Bellamy, Sharpe) to question how political Žižek’s political philosophy finally is, Žižek’s critique of ideology ultimately turns on a set of fundamental ontological propositions about the necessary limitations of any linguistic or symbolic system. These propositions concern the widely known paradoxes that bedevil any attempt by a semantic system to explain its own limits, and/or how it came into being. If what preceded the system was radically different from what subsequently emerged, how could the system have emerged from it, and how can the system come to terms with it at all? If we name the limits of what the system can understand, do we not, in that very gesture, presuppose some knowledge of what is beyond these limits, if only enough to say what the system is not? The only manner in which we can explain the origin of language is within language, Žižek notes in For They Know Not What They Do. Yet we hence presuppose, again in the very act of the explanation, the very thing we were hoping to explain. Similarly, to take the example from political philosophy of Hobbes’ explanation of the origin of sociopolitical order: the only way we can explain the origin of the social contract is by presupposing that Hobbes’ wholly pre-social men nevertheless possessed in some way the very social abilities to communicate and make pacts that Hobbes’ position is supposed to explain.

For Žižek, fantasy as such is always fundamentally the fantasy of (one’s) origins. In Freud’s “Wolf Man” case, to take the psychoanalytic example Žižek cites in For They Know Not What They Do, the primal scene of parental coitus is the Wolf Man’s attempt to come to terms with his own origin—or to answer the infant’s perennial question “where did I come from?” The problem here is this: who could the spectacle of this primal scene have been staged for or seen by, if it really transpired before the genesis of the subject that it would explain (see 3e, 4e)? The only answer is that the Wolf Man has imaginatively transposed himself back, if only as an impassive object-gaze, into the very scene whose historical occurrence he had hoped would explain his origin as an individual.

Žižek’s argument is that, in the same way, political or ideological systems cannot and do not avoid deep inconsistencies. No less than Machiavelli, Žižek is acutely aware that the act that founds a body of Law is never itself legal, according to the very order of Law it sets in place. He cites Bertolt Brecht: “what is the robbing of a bank, compared to the founding of a bank?” What fantasy does, in this register, is to try to historically renarrativize the founding political act as if it were or had been legal—an impossible application of the Law before the Law had itself come into being. No less than the Wolf Man’s false transposition of himself back into the primal scene that was to explain his origin, Žižek argues that the attempt of any political regime to explain its own origins in a political myth that denies the fundamental, extralegal violence of these origins is fundamentally false. (Žižek uses the example of the liberal myth of primitive accumulation to illustrate his position in For They Know Not What They Do, but we could cite here Plato’s myth of the reversed cosmos in the Laws and Statesman, or historical cases like the idea of terra nullius in colonial Australia).

e. Exemplification: the Fall and Radical Evil (Žižek’s Critique of Kant)

In a series of places, Žižek situates his ontological position in terms of a striking reading of Immanuel Kant’s practical philosophy. Žižek argues that in “Religion Within the Limits of Reason Alone” Kant showed that he was aware of these paradoxes that necessarily attend any attempt to narrate the origins of the Law. The Judeo-Christian myth of the fall succumbs to precisely these paradoxes, as Kant analyses: if Adam and Eve were purely innocent, how could they have been tempted?; if their temptation was wholly the fault of the tempter, why then has God punished humans with the weight of original sin?; but if Adam and Eve were not purely innocent when the snake lured them, in what sense was this a fall at all? According to Žižek, Kant’s text also provides us with theoretical parameters which allow us to explain and avoid these paradoxes. The problems for the mythical narrative, Kant argues, stem from its nature as a narrative—from how it tries to render in a historical story what he argues is truly a logical or transcendental priority. For Kant, human beings are, as such, radically evil. They have always already chosen to assert their own self-conceit above the moral Law. This choice of radical evil, however, is not itself a historical choice, either for individuals or for the species, for Kant. This choice is what underlies and opens up the space for all such historical choices. However, as Žižek argues, Kant withdraws from the strictly diabolical implications of this position. The key place in which this withdrawal is enacted is in the postulates of The Critique of Practical Reason, wherein Kant defends the immortality of the soul as a likely story on the basis of our moral experience. Because of radical evil, Kant argues, it is impossible for humans ever to act purely out of duty in this life—this is what Kant thinks our irremovable sense of moral guilt attests to. But because people can never act purely in this life, Kant suggests, it is surely reasonable to hope and even to postulate that the soul lives on after death, striving ever closer towards the perfection of its will.

Žižek’s contention is that this argument does not prove the immortality of a disembodied soul. It proves the immortality of an embodied individual soul, always struggling guiltily against its selfish corporeal impulses (this, incidentally, is one reason why Žižek argues, after Lacan, that de Sade is the truth of Kant). In order to make his proof even plausible, Žižek notes, Kant has to tacitly smuggle the spatiotemporal parameters of embodied earthly existence into the postulated hereafter, so that the guilty subject can continue endlessly to struggle against his radically evil nature towards the good. In this way, though, Kant himself has to speak as if he knew what things are like on the other side of death—which is to say, from the impossible, because impossibly neutral, perspective of someone able to impassively see the spectacle of the immortal subject striving guiltily towards the good (see 4d). But in this way, also, Žižek argues that Kant enacts exactly the type of fantasmatic operation that his own reading of the fall narrative criticizes, and which represents in nuce the basic operation of all political ideologies.

4. From Ontology to Ethics—Žižek’s Reclaiming of the Subject

a. Žižek’s Subject, Fantasy, and the Objet Petit a

Perhaps Žižek’s most radical challenge to accepted theoretical opinion is his defense of the modern, Cartesian subject. Žižek knowingly and polemically positions his writings against virtually all other contemporary theorists, with the significant exception of Alain Badiou. Yet for Žižek, the Cartesian subject is not reducible to the fully self-assured “master and possessor of nature” of Descartes’ Discourse on Method. It is what Žižek calls, in “Kant With (Or Against) Kant,” an out-of-joint ontological excess or clinamen. Žižek takes his bearings here as elsewhere from a Lacanian reading of Kant, and the latter’s critique of Descartes’ cogito ergo sum. In the “Transcendental Dialectic” of The Critique of Pure Reason, Kant criticized Descartes’ argument that the self-guaranteeing “I think” of the cogito must be a thinking thing (res cogitans). For Kant (as for Žižek), while the “I think” must be capable of accompanying all of the subject’s perceptions, this does not mean that it is itself such a substantial object. The subject that sees objects in the world cannot see itself seeing, Žižek notes, any more than a person can jump over her own shadow. To the extent that a subject can reflectively see itself, it sees itself not as a subject but as one more represented object, what Kant calls the “empirical self” or what Žižek calls the “self” (versus the subject) in The Plague of Fantasies. The subject knows that it is something, Žižek argues. But it does not and can never know what Thing it is “in the Real”, as he puts it (see 2e). This is why it must seek clues to its identity in its social and political life, asking the question of others (and of the big Other (see 2b)) which Žižek argues defines the subject as such: che vuoi? (what do you want from me?). In Tarrying With the Negative, Žižek hence reads the Director’s Cut of Ridley Scott’s Blade Runner as revelatory of the Truth of the subject. Within this version of the film, as Žižek emphasizes, the main character Deckard literally does not know what he is—a robot that perceives itself to be human. According to Žižek, the subject is a “crack” in the universal field or substance of being, not a knowable thing (see 4d). This is why Žižek repeatedly cites in his books the disturbing passage from the young Hegel describing the modern subject not as the “light” of the modern enlightenment, but as “this night, this empty nothing …”

It is crucial to Žižek’s position, though, that Žižek denies the apparent implication of this that the subject is some kind of supersensible entity, for example, an immaterial and immortal soul. The subject is not a special type of Thing outside of the phenomenal reality we can experience, for Žižek. As we saw in 1e above, such an idea would in fact reproduce in philosophy the type of thinking which, he argues, characterizes political ideologies and the subject’s fundamental fantasy (see 3a). It is more like a fold or crease in the surface of this reality, as Žižek puts it in Tarrying With the Negative, the point within the substance of reality wherein that substance is able to look at itself, and see itself as alien to itself. According to Žižek, Hegel and Lacan add to Kant’s reading of the subject as the empty “I think” that accompanies any individual’s experience the caveat that, because objects thus appear to a subject, they always appear in an incomplete or biased way. Žižek’s “formula” of the fundamental fantasy (see 2a, 2d) “$ <> a” tries to formalize exactly this thought. Its meaning is that the subject ($), in its fundamental fantasy, misrecognizes itself as a special object (the objet petit a or lost object (see 2a)) within the field of objects that it perceives. In terms which unite this psychoanalytic notion with Žižek’s political philosophy, we can say that the objet petit a is exactly a sublime object (2e). It is an object that is elevated or, in Freudian terms, “sublimated” by the subject to the point where it stands as a metonymic representative of the jouissance the subject unconsciously fantasizes was taken from her/him at castration (3a). It hence functions as the object-cause of the subject’s desire: that exceptional “little piece of the Real” that s/he seeks out in all of her/his love relationships. Its psychoanalytic paradigms are, to cite the title of a collection Žižek edited, the “gaze and voice as love objects”. Examples of the voice as objet petit a include the persecutor’s voice in paranoia, or the very silence that some TV advertisements now use, and which captures our attention by making us wonder whether we may not have missed something. The preeminent Lacanian illustration of the gaze as objet petit a is the anamorphic skull at the foot of Holbein’s The Ambassadors, which can only be seen by a subject who looks at it awry, or from an angle. Importantly, then, neither the voice nor the gaze as objet petit a attests to the subject’s sovereign ability to wholly objectify (and hence control) the world it surveys. In the auditory and visual fields (respectively), the voice and the gaze as objet petit a represent objects, like Kant’s sublime things, that the subject cannot wholly get its head around, as we say. The fact that they can only be seen or heard from particular perspectives indicates exactly how the subject’s biased perspective—and so his/her desire, what s/he wants—has an effect on what s/he is able to see. They thereby bear witness to how s/he is not wholly outside of the reality s/he sees. Perhaps the most mundane but telling example of this subjective objet petit a of Lacanian theory is someone in love, of whom we commonly say that they are able to see in their lover something special, an “X factor,” to which others are utterly blind. In the political field, similarly—and as we saw in part 2c—subjects of a particular political community will claim that others cannot understand their regime’s sublime objects. Indeed, as Žižek comments about the resurgence of racism across the first world today, it is often precisely the strangeness of others’ particular ethnic or national Things that animates subjects’ hatred towards them.

b. The Objet Petit a & the Virtuality of Reality

In Žižek’s theory, the objet petit a stands as the exact opposite of the object of the modern sciences, which can only be seen clearly and distinctly if it is approached wholly impersonally. If the objet petit a is not looked at from a particular, subjective perspective—or, in the words of one of Žižek’s titles, by “looking awry”—it cannot be seen at all. This is why Žižek believes this psychoanalytic notion can be used to structure our understanding of the sublime objects postulated by ideologies in the political field, which, as we saw in 3c, show themselves to be finally inconsistent when they are looked at dispassionately. What Žižek’s Lacanian critique of ideology aims to do is to demonstrate such inconsistencies, and thereby to show us that the objects most central to our political beliefs are Things whose very sublime appearance conceals from us our active agency in constructing and sustaining them. (We will return to this thought in 4d and 4e below.)

Žižek argues that the first place that the objet petit a appeared in the history of Western philosophy was with Kant’s notion of the transcendental object in The Critique of Pure Reason. Analyzing this Kantian notion allows us to elaborate more precisely the ontological status of the objet petit a. Kant defines the transcendental object as “the completely indeterminate thought of an object in general.” Like the objet petit a, then, Kant’s transcendental object is not a normal phenomenal object, although it has a very specific function in Kant’s epistemological conception of the subject. The avowedly anti-Humean function of this Kantian positing in the “Transcendental Deduction” is to ensure that the purely formal categories of the subject’s understanding can actually affect and indeed structure the manifold of the subject’s sensuous intuition. As Žižek stresses, that is, the transcendental object functions in Kant’s epistemology to guarantee that sense will continue to emerge for the subject, no matter what particular objects s/he might encounter.

We saw in 3c how Žižek argues that ideologies adduce ultimately inconsistent reasons to support the same goal of political unity. According to Žižek, as we can now elaborate, this is because the deepest political function of sublime objects of ideology is to ensure that the political world will make sense for subjects no matter what events transpire, in a way that he directly compares with Kant’s transcendental object. No matter what evidence someone might produce that Jewish people are not all acquisitive, capitalist, or cunning, for example, a true Nazi will be able to immediately resignify this evidence by reference to his ideological notion of “the Jew”: “surely it is part of their cunning to appear as though they are not truly cunning,” and so forth. Importantly, it follows for Žižek that political community is always, in its very structure, an anticipated community. Subjects’ sense of political belonging is always mediated, according to him, by their shared belief in their regime’s key words or master signifiers. But these are words whose only “meaning” lies finally in their function, which is to guarantee that there will (continue to) be meaning. There is, Žižek argues, ultimately no actual, Real Thing, superior to the other real things subjects encounter, that these words name (2e). It is only by acting as if there were such a Thing that community is maintained. This is why Žižek specifies in The Indivisible Remainder that political identification can only be, “at its most basic, identification with the very gesture of identification”:

…the coordination [between subjects in a political community] concerns not the level of the signified [of some positive shared concern] but the level of the signifier. [In political ideologies], undecidability with regard to the signified (do others really intend the same as me?) converts into an exceptional signifier, the empty Master-Signifier, the signifier-without-signified. ‘Nation’, ‘Democracy’, ‘Socialism’ and other Causes stand for that ‘something’ about which we are never sure what, exactly, it is – the point is, rather, that identifying with the Nation we signal our acceptance of what others accept, with a Master-Signifier which serves as the rallying point for all the others. (Žižek, 1996: 142)

This is the sense also in which Žižek claims in The Plague of Fantasies that today’s virtual reality is “not virtual enough.” It is not virtual enough because the many options it offers subjects to enjoy (jouir) make transgressive or exotic possibilities directly available. VR thus leaves nothing to the imagination or, in Žižek’s Lacanian terms, to fantasy. Fantasy, as we saw in 2a, operates to structure subjects’ beliefs about the jouissance which must remain only the stuff of imagination, purely “virtual” for subjects of the social law. For Žižek, then, it is identification with this law, as mediated via subjects’ anticipatory identifications with what they suppose others believe, that involves true virtuality.

c. Forced Choice & Ideological Tautologies

As 4b confirms (and as we commented in 1c), Žižek’s political philosophy turns around the idea that the central words of political ideologies are at base “signifiers without signified,” words that only appear to refer to exceptional Things, and which thereby facilitate the identification between subjects. As Žižek argues, these sublime objects of ideology have exactly the ontological status of what Kant called “transcendental illusions”—illusions whose semblance conceals that there is nothing behind them to conceal. Ideological subjects do not know what they do when they believe in them, Žižek contends. Yet, through the presupposition that the Other(s) know (2c), and their participation in the practices involving the inherent transgression of their political community (2c), they “identify with the very gesture of identification” (4b). Hence their belief, coupled with these practices, is politically efficient.

One of Žižek’s most difficult, but also deepest, claims is that the particular sublime objects of ideology with which subjects identify in different regimes (the Nation, the People, and so forth) each give particular form to a meta-law (a law about all other laws) that binds any political community as such. This is the meta-law that says simply that subjects must obey all the other laws. In 2b above, we saw how Žižek holds that political ideologies must allow subjects a sense of subjective distance from their explicit directives. Žižek’s critical position is that this apparent freedom ideologies thereby allow subjects is finally a lure. Like the choice offered Yossarian by the “Catch-22” of Joseph Heller’s novel, the only option truly available to political subjects is to continue to abide by the laws. No regime can survive if it waives this meta-law. The Sublime Object of Ideology hence cites with approval Kafka’s comment that it is not required that subjects think the law is just, only that it is necessary. Yet no regime, despite Kafka, can directly avow its own basis in such naked self-assertion without risking the loss of all legitimacy; here Žižek agrees with Plato. This is why it must ground itself in ideological fantasies (3a) which at once sustain subjects’ sense of individual freedom (2c) and the sense that the regime itself is grounded extra-politically in the Real, and in some transcendent, higher Good (2e).

This thought underlies the importance Žižek accords in For They Know Not What They Do to Hegel’s difficult notion of tautology as the highest instance of contradiction in The Science of Logic. If you push a subject hard enough about why they abide by the laws of their regime, Žižek holds, their responses will inevitably devolve into some logical variant of Exodus 3:14’s “I am that I am”: statements of the form “because the Law (God / the People / the Nation) is … the Law (God / the People / the Nation)”. In such tautological statements, our expectation that the predicates in the second half of the sentence will add something new to the (logical) subject given at its beginning is “contradicted,” Hegel argues. There is indeed something sinister when someone utters such a sentence in response to our enquiries, Žižek notes—as if, when (for example) “the Law” is repeated dumbly as its own predicate (“because the law is the law”), it intimates the uncanny dimension of jouissance the law as ego ideal usually proscribes (3a). What this uncanny effect of sense attests to, Žižek argues in For They Know Not What They Do, is the usually “primordially repressed” force of the universal meta-law (that everyone must obey the laws) being expressed in the different, particular languages of political regimes: “because the People are the People,” “because the Nation is the Nation”, and so forth.

Žižek’s ideology critique hence contends that all political regimes’ ideologies finally devolve around a set of such tautological propositions concerning their particular sublime objects. In The Sublime Object of Ideology, Žižek gives the example of a key Stalinist proposition: “the people always support the party.” On its surface, this looks like a claim that asserts something about the world, and which might be susceptible of disproof: perhaps there are some Soviet citizens who do not support the party, or who disagree with this or that of the party’s policies. What such an approach misses, however, is how in this ideology, what is referred to as “the people” in fact means “all those who support the party.” In Stalinism, that is, “the party” is the fetishized particular that stands for the people’s true interests (see 1e). Hence, the sentence “the people always support the party” is a concealed form of tautology. Any apparent “people” who in fact do not support the party are, by that fact alone, no longer “people” within Stalinist ideology.

d. The Substance is Subject, the Other Does Not Exist

In 4b, we saw how Žižek argues that political identification is identification with the gesture of identification. In 4c, we saw how the ultimate foundation of a regime’s laws is a tautologous assertion of the bare political fact that there is law. What unites these two positions is the idea that the sublime objects of a political regime and the ideological fantasies that give narratives about their content conceal from subjects the absence of any final ground for Law beyond the fact of its own assertion, and the fact that subjects take it to be authoritative. Here as elsewhere, Žižek’s work surprisingly approaches leading motifs in the political philosophy of Carl Schmitt.

Importantly, once this position is stated, we can also begin to see how Žižek’s post-Marxist project of a critique of ideology intersects with his philosophical defense of the Cartesian subject. At several points in his oeuvre, Žižek cites Hegel’s statement in the “Preface” to the Phenomenology of Spirit that “the substance is subject” as a rubric that describes the core of his own political philosophy. According to Žižek, critics have misread this statement by taking it to repeat the founding, triumphalist idea of modern subjectivity as such—namely, that the subject can master all of nature or “substance.” Žižek contends, controversially, that Hegel’s claim ought to be read in a directly opposing sense. For him, it indicates the truth that there can be no dominant political regime or, in Hegel’s terms, no “social substance” that does not depend for its authority upon the active, indeed finally anticipatory (4c) investment of subjects in it. Like the malign computer machines in The Matrix that literally run off the human jouissance they drain from deluded subjects, for Žižek the big Other of any political regime does not exist as a self-sustaining substance. It must ceaselessly run on the belief and actions of its subjects, and their jouissance (2c)—or, to recur to the example we looked at in 2d, the King will not be the King, for Žižek, unless he has his subjects. It is certainly telling that the leading examples of ideological tautology For They Know Not What They Do discusses invoke precisely some subject’s will or decision, as when a parent says to a child “do this … because I said so,” or when people do something “… because the King said so,” which means that no more questions can be asked.

In 4a, we saw how Žižek denies that the subject, because it is not itself a perceptible object, belongs to an order of being wholly outside of the order of experience. To elevate such a wholly Other order would, he argues, reproduce the elementary operation of the fundamental fantasy. We can now add to this thought the further position that the Cartesian subject, according to Žižek, is finally nothing other than the irreducible point of active agency responsible for the always minimally precipitous political gesture of laying down a regime’s law. For Žižek, accordingly, the critical question to be asked of any theoretical or political position that posits some exceptional Beyond, as we saw in his reading of Kant (2e), is: from which subject-position do you speak when you claim a knowledge of this Beyond? As we saw in 2e, Žižek’s Lacanian answer is that the perspective one always presupposes when one speaks in this manner is a “superegoic” one (see 2a)—tied to what he terms in Metastases of Enjoyment a “malevolently neutral” God’s eye view from nowhere. It is deeply revealing, from Žižek’s perspective, that the very perspective which allows the Kantian subject in the “dynamic sublime” to resignify its own finitude as itself a source of pleasure-in-pain (jouissance) is precisely one which identifies with the supersensible moral Law, before which the sensuous subject remains irredeemably guilty, infinitely striving to pay off its moral debt. As Žižek cites Hegel’s Phenomenology of Spirit:

It is manifest that beyond the so-called curtain [of phenomena] which is supposed to conceal the inner world, there is nothing to be seen unless we go behind it ourselves as much in order that we may see, as that there may be something behind there which can be seen. (Žižek, 1989: 196, emphasis added)

In other words, Žižek’s final position about the sublime objects of political regimes’ ideologies is that these belief-inspiring objects are so many ways in which the subject misrecognizes its own active capacity to challenge existing laws, and to found new laws altogether. Žižek repeatedly argues that the most uncanny or abyssal Thing in the world is the subject’s own active subjectivity—which is why he also repeatedly cites the Eastern saying “Thou art that.” It is finally the singularity of the subject’s own active agency that subjects misperceive in fantasies concerning the sublime objects of their regimes’ ideologies, in the face of which they can do nothing but reverentially abide by the rules. In this way, it is worth noting, Žižek’s work can claim a heritage not only from Hegel, but also from the Left Hegelians, and Marx’s and Feuerbach’s critiques of religion.

e. The Ethical Act Traversing the Fantasy

Žižek’s technical term for the process whereby we can come to recognize how the sublime objects of our political regimes’ ideologies are, like Marx’s commodities, fetish objects that conceal from subjects their own political agency, is the “traversing of the fantasy.” Traversing the fantasy, for Žižek, is at once the political subject’s deepest form of self-recognition and the basis for his own radical political position, or defense of the possibility of such positions. Žižek’s entire theoretical work directs us towards this “traversing of the fantasy” in the many different fields on which he has written, despite the widespread consensus at the beginning of the new century that fundamental political change is no longer possible or desirable.

Insofar as political ideologies for Žižek, as for Althusser (see 2c), remain viable only because of the ongoing practices and beliefs of political subjects, this traversal of fantasy must always involve an active, practical intervention in the political world, one which changes a regime’s political institutions. As for Kant, so for Žižek, the practical bearing of critical reason comes first, in his critique of ideology, and last, in his advocacy of the possibility of political change. Žižek hence also repeatedly speaks of traversing the fantasy in terms of an “Act” (capital “A”), which differs from normal human speech and action. Everyday speech and action typically do not challenge the framing sociopolitical parameters within which they take place, Žižek observes. By contrast, what he means by an Act is an action which “touches the Real” (as he says) of what a sociopolitical regime has politically repressed or wiped its hands of, and which it cannot publicly avow without risking fundamental political damage (see 2c). In this way, the Žižekian Act extends and changes the very political and ideological parameters of what is permitted within a regime, in the hope of bringing into being new parameters in the light of which its own justice will be able to be retrospectively seen. This is the point of significant parallel with Alain Badiou’s work, whose influence Žižek has increasingly avowed in his more recent books. Notably, as Žižek specifies in The Indivisible Remainder, the Act effectively repeats the very act that he claims founds all political regimes as such, namely, the excessive, law-founding gesture we examined in 4c. Just as the current political regime originated in a founding gesture excessive with regard to the laws it set in place, Žižek argues, so too can this political regime itself be superseded, and a new one replace it. In his reading of Walter Benjamin’s “Theses on the Philosophy of History” in The Sublime Object of Ideology, Žižek indeed argues that such a new Act also effectively repeats all previous, failed attempts at changing an existing political regime, which would otherwise be consigned forever to historical oblivion.

5. Conclusion

Slavoj Žižek’s work represents a striking challenge within the contemporary philosophical scene. Žižek’s very style, and his prodigious ability to write on and examine examples from widely divergent fields, are remarkable. His work reintroduces and reinvigorates for a wider audience ideas from the works of German Idealism. Žižek’s work is framed in terms of a polemical critique of other leading theorists within today’s new left or liberal academy (Derrida, Habermas, Deleuze), which claims to unmask their apparent radicality as concealing a shared recoil from the possibility of a subjective, political Act, a recoil which in fact sits comfortably with a passive resignation to today’s political status quo. Not the least interesting feature of his work, politically, is indeed how Žižek’s critique of the new left both significantly mirrors criticisms from conservative and neoconservative authors, yet hails from an avowedly opposed political perspective. In political philosophy, Žižek’s Lacanian theory of ideology presents a radically new descriptive perspective that affords us a unique purchase on many of the paradoxes of liberal consumerist subjectivity, which is at once politically cynical (as the political right laments) and politically conformist (as the political left struggles to come to terms with). Prescriptively, Žižek’s work challenges us to ask questions about the possibility of sociopolitical change that have otherwise rarely been asked after 1989, including what forms such changes might take, and what might justify them or make them possible.

Looked at in a longer perspective, it is of course too soon to judge what the lasting effects of Žižek’s philosophy will be, especially given Žižek’s comparative youth as a thinker (Žižek was born in 1949). In terms of the history of ideas, in particular, while Žižek’s thought certainly turns many of today’s widely accepted theoretical notions on their heads, it remains an open question whether his work represents a lasting break with the parameters that Kant’s critical philosophy set out in the three Critiques.

6. References and Further Reading

a. Primary Literature (Books by Žižek)

  • Iraq: The Borrowed Kettle, New York: Verso, 2004.
  • Organs Without Bodies: On Deleuze and Consequences, New York, London: Routledge, 2003.
  • The Puppet and the Dwarf, New York: Routledge, 2003.
  • Did Somebody Say Totalitarianism? Five Essays on the (Mis)Use of a Notion, London; New York: Verso, 2001.
  • The Fright of Real Tears, Kieslowski and The Future, Bloomington: Indiana University Press, 2001.
  • On Belief, London: Routledge, 2001.
  • The Fragile Absolute or Why the Christian Legacy is Worth Fighting For, London; New York: Verso, 2000.
  • The Art of the Ridiculous Sublime, On David Lynch’s Lost Highway, Walter Chapin Center for the Humanities: University of Washington, 2000.
  • Contingency, Hegemony, Universality: Contemporary Dialogues on the Left, Judith Butler, Ernesto Laclau and SZ. London; New York: Verso, 2000.
  • Enjoy Your Symptom! Jacques Lacan in Hollywood and Out, second expanded edition, New York: Routledge, 2000.
  • The Ticklish Subject: The Absent Centre of Political Ontology, London; New York: Verso, 1999.
  • The Abyss of Freedom / Ages of the World, with F.W.J. von Schelling, Ann Arbor: University of Michigan Press, 1997.
  • The Plague of Fantasies, London; New York: Verso, 1997.
  • Gaze And Voice As Love Objects, Renata Salecl and SZ editors. Durham: Duke University Press, 1996.
  • The Indivisible Remainder: An Essay On Schelling And Related Matters, London; New York: Verso, 1996.
  • The Metastases Of Enjoyment: Six Essays On Woman And Causality (Wo Es War), London; New York: Verso, 1994.
  • Mapping Ideology, SZ editor. London; New York: Verso, 1994.
  • Tarrying With The Negative: Kant, Hegel And The Critique Of Ideology, Durham: Duke University Press, 1993.
  • Enjoy Your Symptom! Jacques Lacan In Hollywood And Out, London; New York: Routledge, 1992.
  • Everything You Always Wanted to Know about Lacan (But Were Afraid To Ask Hitchcock), SZ editor. London; New York: Verso, 1992.
  • Looking Awry: an Introduction to Jacques Lacan through Popular Culture, Cambridge, Mass.: MIT Press, 1991.
  • For They Know Not What They Do: Enjoyment As A Political Factor, London; New York: Verso, 1991.
  • The Sublime Object of Ideology, London; New York: Verso, 1989.

b. Secondary Literature (Texts on Žižek)

  • Slavoj Žižek: A Little Piece of the Real, Matthew Sharpe, Hants: Ashgate, 2004.
  • Slavoj Žižek: A Critical Introduction, Ian Parker, London: Pluto Press, 2004.
  • Slavoj Žižek: Live Theory, Rex Butler, London: Continuum, 2004.
  • Žižek: A Critical Introduction, Sarah Kay, London: Polity, 2003.
  • Slavoj Žižek (Routledge Critical Thinkers), Tony Myers, London: Routledge, 2003.

 

Author Information

Matthew Sharpe
Email: matthew.sharpe@dewr.gov.au

Australia

Karl Popper: Philosophy of Science

Karl Popper (1902-1994) was one of the most influential philosophers of science of the 20th century. He made significant contributions to debates concerning general scientific methodology and theory choice, the demarcation of science from non-science, the nature of probability and quantum mechanics, and the methodology of the social sciences. His work is notable for its wide influence within the philosophy of science, within science itself, and within a broader social context.

Popper’s early work attempts to solve the problem of demarcation and offer a clear criterion that distinguishes scientific theories from metaphysical or mythological claims. Popper’s falsificationist methodology holds that scientific theories are characterized by entailing predictions that future observations might reveal to be false. When theories are falsified by such observations, scientists can respond by revising the theory, by rejecting the theory in favor of a rival, or by maintaining the theory as is and changing an auxiliary hypothesis. In any of these cases, however, the process must aim at the production of new, falsifiable predictions, though Popper recognizes that scientists can and do hold onto theories in the face of failed predictions when there are no predictively superior rivals to turn to. He holds that scientific practice is characterized by its continual effort to test theories against experience and make revisions based on the outcomes of these tests. By contrast, theories that are permanently immunized from falsification by the introduction of untestable ad hoc hypotheses can no longer be classified as scientific. Among other things, Popper argues that his falsificationist proposal allows for a solution of the problem of induction, since inductive reasoning plays no role in his account of theory choice.

Along with his general proposals regarding falsification and scientific methodology, Popper is notable for his work on probability and quantum mechanics and on the methodology of the social sciences. Popper defends a propensity theory of probability, according to which probabilities are interpreted as objective, mind-independent properties of experimental setups. Popper then uses this theory to provide a realist interpretation of quantum mechanics, though its applicability goes beyond this specific case. With respect to the social sciences, Popper argues against the historicist attempt to formulate universal laws covering the whole of human history and instead advocates methodological individualism and situational logic.

Table of Contents

  1. Background
  2. Falsification and the Criterion of Demarcation
    1. Popper on Physics and Psychoanalysis
    2. Auxiliary and Ad Hoc Hypotheses
    3. Basic Sentences and the Role of Convention
    4. Induction, Corroboration, and Verisimilitude
  3. Criticisms of Falsificationism
  4. Realism, Quantum Mechanics, and Probability
  5. Methodology in the Social Sciences
  6. Popper’s Legacy
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Background

Popper began his academic studies at the University of Vienna in 1918, focusing on both mathematics and theoretical physics. In 1928, he received a PhD in Philosophy. His dissertation, On the Problem of Method in the Psychology of Thinking, dealt primarily with the psychology of thought and discovery. Popper later reported that it was while writing this dissertation that he came to recognize “the priority of the study of logic over the study of subjective thought processes” (1976, p. 86), a conviction that would become a primary focus of his more mature work in the philosophy of science.

In 1935, Popper published Logik der Forschung (The Logic of Research), his first major work in the philosophy of science. Popper later translated the book into English and published it under the title The Logic of Scientific Discovery (1959). In the book, Popper offered his first detailed account of scientific methodology and of the importance of falsification. Many of the arguments in this book, as well as throughout his early work, are directed against members of the so-called “Vienna Circle,” such as Moritz Schlick, Otto Neurath, Rudolf Carnap, Hans Reichenbach, Carl Hempel, and Herbert Feigl, among others. Popper shared these thinkers’ concern with general issues of scientific methodology, and he sympathized with their distrust of traditional philosophical methodology. His proposed solutions to the problems arising from these concerns, however, were significantly different from those favored by the Vienna Circle.

Popper stayed in Vienna until 1937, when he took a teaching position at Canterbury University College in Christchurch, New Zealand, and he stayed there throughout World War II. His major works on the philosophy of science from this period include the articles that would eventually make up The Poverty of Historicism (1957). In these articles, he offered a highly critical analysis of the methodology of the social sciences, in particular, of attempts by social scientists to formulate predictive, explanatory laws.

In 1946, Popper took a teaching position at the London School of Economics, where he stayed until he retired in 1969. While there, he continued to work on a variety of issues relating to the philosophy of science, including quantum mechanics, entropy, evolution, and the realism vs. anti-realism debate, along with the issues already mentioned. His major works from this period include “The Propensity Interpretation of Probability” (1959) and Conjectures and Refutations (1963). He continued to publish until shortly before his death in 1994. In The Philosophy of Karl Popper (1974), Popper offers responses to many of his most important critics and provides clarifications of his mature views. His intellectual autobiography Unended Quest (1976) gives a detailed account of Popper’s evolving views, especially as they relate to the philosophy of science.

2. Falsification and the Criterion of Demarcation

Much of Popper’s early work in the philosophy of science focuses on what he calls the problem of demarcation, or the problem of distinguishing scientific (or empirical) theories from non-scientific theories. In particular, Popper aims to capture the logical or methodological differences between scientific disciplines, such as physics, and non-scientific disciplines, such as myth-making, philosophical metaphysics, Freudian psychoanalysis, and Marxist social criticism.

Popper’s proposals concerning demarcation can be usefully seen as a response to the verifiability criterion of demarcation proposed by logical empiricists, such as Carnap and Schlick. According to this criterion, a statement is cognitively meaningful if and only if it is, in principle, possible to verify it. This criterion is intended to, among other things, capture the idea that the claims of empirical science are meaningful in a way that the claims of traditional philosophical metaphysics are not. For example, this criterion entails that claims about the locations of mid-sized objects are meaningful, since one can, in principle, verify them by going to the appropriate location. By contrast, claims about the fundamental nature of causation are not meaningful.

While Popper shares the belief that there is a qualitative difference between science and philosophical metaphysics, he rejects the verifiability criterion for several reasons. First, it counts existential statements (like “unicorns exist”) as scientific, even though there is no way of definitively showing that they are false. After all, the mere fact that one has failed to see a unicorn in a particular place does not establish that unicorns could not be observed in some other place. Second, it inappropriately counts universal statements (like “all swans are white”) as meaningless simply because they can never be conclusively verified. These sorts of universal claims, though, are common within science, and certain observations (like the observation of a black swan) can clearly show them to be false. Finally, the verifiability criterion is by its own lights not meaningful, since it cannot be verified.

Partially in response to worries such as these, the logical empiricists’ later work abandons the verifiability criterion of meaning and instead emphasizes the importance of the empirical confirmation of scientific theories. Popper, however, argues that verification and confirmation can play no role in formulating a satisfactory criterion of demarcation. Instead, Popper proposes that scientific theories are characterized by being bold in two related ways. First, scientific theories regularly disagree with accepted views of the world based on common sense or previous theoretical commitments. To an uneducated observer, for example, it may seem obvious that Earth is stationary, while the sun moves rapidly around it. However, Copernicus posited that Earth in fact revolves around the sun. In a similar way, it does not seem as though a tree and a human share a common ancestor, but this is what Darwin’s theory of evolution by natural selection claims. As Popper notes, however, this sort of boldness is not unique to scientific theories, since most mythological and metaphysical theories also make bold, counterintuitive claims about the nature of reality. For example, the accounts of world creation provided by various religions would count as bold in this sense, but this does not mean that they thereby count as scientific theories.

With this in mind, he goes on to argue that scientific theories are distinguished from non-scientific theories by a second sort of boldness: they make testable claims that future observations might reveal to be false. This boldness thus amounts to a willingness to take a risk of being wrong. On Popper’s view, scientists investigating a theory make repeated, honest attempts to falsify the theory, whereas adherents of pseudoscientific or metaphysical theories routinely take measures to make the observed reality fit the predictions of the theory. Popper describes his proposal as follows:

Thus my proposal was, and is, that it is this second boldness, together with the readiness to look for tests and refutations, which distinguished “empirical” science from non-science, and especially from pre-scientific myths and metaphysics (1974, pp. 980-981)

In other places, Popper calls attention to the fact that scientific theories are characterized by possessing potential falsifiers—that is, that they make claims about the world that might be discovered to be false. If these claims are, in fact, found to be false, then the theory as a whole is said to be falsified. Non-scientific theories, by contrast, do not have any such potential falsifiers—there is literally no possible observation that could serve to falsify these theories.

Popper’s falsificationist proposal differs from the verifiability criterion in several important ways. First, Popper does not hold that non-scientific claims are meaningless. Instead, he argues that such unfalsifiable claims can often serve important roles in both scientific and philosophical contexts, even if we are incapable of ascertaining their truth or falsity. Second, while Popper is a realist who holds that scientific theories aim at the truth (see Section 4), he does not think that empirical evidence can ever provide us grounds for believing that a theory is either true or likely to be true. In this sense, Popper is a fallibilist who holds that while the particular unfalsified theory we have adopted might be true, we could never know this to be the case. For these same reasons, Popper holds that it is impossible to provide justification for one’s belief that a particular scientific theory is true. Finally, where others see science progressing by confirming the truth of various particular claims, Popper describes science as progressing on an evolutionary model, with observations selecting against unfit theories by falsifying them.

a. Popper on Physics and Psychoanalysis

In order to see how falsificationism works in practice, it will help to consider one of Popper’s most memorable examples: the contrast between Einstein’s theory of general relativity and the theories of psychoanalysis defended by Sigmund Freud and Alfred Adler. We might roughly summarize the theories as follows:

General relativity (GR): Einstein’s theory of special relativity posits that the observed speed of light in a vacuum will be the same for all observers, regardless of which direction or at what velocity these observers are themselves moving. GR allows this theory to be applied to cases where acceleration or gravity plays a role, specifically by treating gravity as a sort of distortion or bend in space-time created by massive objects.

Psychoanalysis: The theory of psychoanalysis holds that human behavior is driven at least in part by unconscious desires and motives. For example, Freud posited the existence of the id, an unconscious part of the human psyche that aims toward gratifying instinctive desires, regardless of whether this is rational. However, the desires of the id might be mediated or superseded in certain circumstances by its interaction with both the self-interested ego and the moral superego.

As we can see, both theories make bold, counter-intuitive claims about the fundamental nature of reality. Moreover, both theories can account for previously observed phenomena; for example, GR allows for an accurate description of the observed perihelion of Mercury, while psychoanalysis entails that it is possible for people to consistently act in ways that are against their own long-term best interest. Finally, both of these theories enjoyed significant academic support when Popper was first writing about these issues.

Popper argues, however, that GR is scientific while psychoanalysis is not. The reason for this has to do with the testability of Einstein’s theory. As a young man, Popper was especially impressed by Arthur Eddington’s 1919 test of GR, which involved observing during a solar eclipse the degree to which the light from distant stars was deflected when passing by the sun. Importantly, the predictions of GR regarding the magnitude of this deflection disagreed with the then-dominant theory of Newtonian mechanics. Eddington’s observation thus served as a crucial experiment for deciding between the theories, since it was impossible for both theories to give accurate predictions. Of necessity, at least one theory would be falsified by the experiment, which would provide strong reason for scientists to accept its unfalsified rival. On Popper’s view, the continual effort by scientists to design and carry out these sorts of potentially falsifying experiments plays a central role in theory choice and clearly distinguishes scientific theorizing from other sorts of activities. Popper also takes care to note that insofar as GR was not a unified field theory, there was no question of GR’s being the complete truth, as Einstein himself repeatedly emphasized. The scientific status of GR, then, had nothing to do with either (1) the truth of GR as a general theory of physics (the theory was already known to be false) or (2) the confirmation of GR by evidence (one cannot confirm a false theory).

In contrast to such paradigmatically scientific theories as GR, Popper argues that non-scientific theories such as Freudian psychoanalysis do not make any predictions that might allow them to be falsified. The reason for this is that these theories are compatible with every possible observation. On Popper’s view, psychoanalysis simply does not provide us with adequate details to rule out any possible human behavior. Absent these sorts of precise predictions, the theory can be made to fit with, and to provide a purported explanation of, any observed behavior whatsoever.

To illustrate this point, Popper offers the example of two men, one who pushes a child into the water with the intent of drowning it, and another who dives into the water in order to save the child. Popper notes that psychoanalysis can explain both of these seemingly contradictory actions. In the first case, the psychoanalyst can claim that the action was driven by a repressed component of the (unconscious) id; in the second case, that the action resulted from a successful sublimation of this exact same sort of desire by the ego and superego. The point generalizes: regardless of how a person actually behaves, psychoanalysis can be used to explain the behavior. This, in turn, prevents us from formulating any crucial experiments that might serve to falsify psychoanalysis. Popper writes:

The point is very clear. Neither Freud nor Adler excludes any particular person’s acting in any particular way, whatever the outward circumstances. Whether a man sacrificed his life to rescue a drowning child (a case of sublimation) or whether he murdered the child by drowning (a case of repression) could not possibly be predicted or excluded by Freud’s theory (1974, p. 985).

Popper allows that there are often legitimate purposes for positing non-scientific theories, and he argues that theories which start out as non-scientific can later become scientific, as we determine methods for generating and testing specific predictions based on these theories. Popper offers the example of Copernicus’s theory of a sun-centered universe, which initially yielded no potentially falsifying predictions, and so would not have counted as scientific by Popper’s criteria. However, later astronomers determined ways of testing Copernicus’s hypothesis, thus rendering it scientific. For Popper, then, the demarcation between scientific and non-scientific theories is not grounded in the nature of the entities posited by theories, in the truth or usefulness of theories, or even in the degree to which we are justified in believing in such theories. Instead, falsification provides a methodological distinction based on the unique role that observation and evidence play in scientific practice.

b. Auxiliary and Ad Hoc Hypotheses

While Popper consistently defends a falsification-based solution to the problem of demarcation throughout his published work, his own explications of it include a number of qualifications to ensure a better fit with the realities of scientific practice. It is in this context that Popper introduces several of his more notable contributions to the philosophy of science, including auxiliary versus ad hoc hypotheses, basic sentences, and degrees of verisimilitude.

One immediate objection to the simple proposal regarding falsification sketched in the previous section is based on the Duhem-Quine thesis, according to which it is in many cases impossible to test scientific theories in isolation. For example, suppose that a group of investigators uses GR to deduce a prediction about the perihelion of Mercury, but then discovers that this prediction disagrees with their measurements. This failure might lead them to conclude that GR is false; however, the failure of the prediction might also plausibly be blamed on the falsity of some other proposition that the scientists relied on to deduce the apparently falsifying prediction. There are generally a large number of such propositions, concerning everything from the absence of human error to the accuracy of the scientific theories underlying the construction and application of the measuring equipment.
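
The logical situation can be put schematically (a standard reconstruction of the Duhem-Quine point, with symbols chosen for illustration rather than taken from Popper). If the prediction O is deduced not from the theory H alone but from H together with auxiliary assumptions A_1, …, A_n, then a failed prediction refutes only the conjunction:

\[
(H \wedge A_1 \wedge \dots \wedge A_n) \rightarrow O, \qquad \neg O \;\;\therefore\;\; \neg\,(H \wedge A_1 \wedge \dots \wedge A_n),
\]

which leaves open whether the fault lies with H itself or with one of the auxiliary assumptions A_i.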

Popper recognizes that scientists routinely attribute the failure of experiments to factors such as these, and further grants that there is in many cases nothing objectionable about their doing so. On Popper’s view, the distinctive mark of scientific inquiry concerns the investigators’ responses to failed predictions in cases where they do not abandon the falsified theory altogether. In particular, Popper argues that a scientific theory can be legitimately saved from falsification by the introduction of an auxiliary hypothesis that allows for the generation of new, falsifiable predictions. Popper offers an example taken from the early 19th century, when astronomers noticed that the orbit of Uranus deviated significantly from what Newtonian mechanics seemed to predict. In this case, the scientists did not treat Newton’s laws as being falsified by such an observation. Instead, they considered the auxiliary hypothesis that there existed an additional and so far unobserved planet that was influencing the orbit of Uranus. They then used this auxiliary hypothesis, together with the equations of Newtonian mechanics, to predict where this planet must be located. Their predictions turned out to be successful, and Neptune was discovered in 1846.

Popper contrasts this legitimate, scientific method of theory revision with the illegitimate, non-scientific use of ad hoc hypotheses to rescue theories from falsification. Here, an ad hoc hypothesis is one that does not allow for the generation of new, falsifiable predictions. Popper gives the example of Marxism, which he argues had originally made definite predictions about the evolution of society: the capitalist, free-market system would self-destruct and be replaced by joint ownership of the means of production, and this would happen first in the most highly developed economies. By the time Popper was writing in the mid-20th century, however, it seemed clear to him that these predictions were false: free-market economies had not self-destructed, and the first communist revolutions had occurred in relatively undeveloped economies. The proponents of Marxism, however, neither abandoned the theory as falsified nor introduced any new, falsifiable auxiliary hypotheses that might account for the failed predictions. Instead, they adopted ad hoc hypotheses that immunized Marxism against any potentially falsifying observations whatsoever. For example, the continued persistence of capitalism might be blamed on the actions of counter-revolutionaries, without any account of which specific actions these were or of what new predictions about society we should expect instead. Popper concludes that, while Marxism had originally been a scientific theory:

It broke the methodological rule that we must accept falsification, and it immunized itself against the most blatant refutations of its predictions. Ever since then, it can be described only as non-science—as a metaphysical dream, if you like, married to a cruel reality (1974, p. 985).

c. Basic Sentences and the Role of Convention

A second complication for the simple theory of falsification just described concerns the character of the observations that count as potential falsifiers of a theory. The problem here is that decisions about whether to accept an apparently falsifying observation are not always straightforward. For example, there is always the possibility that a given observation is not an accurate representation of the phenomenon but instead reflects theoretical bias or measurement error on the part of the observer(s). Examples of this sort of phenomenon are widespread and occur in a variety of contexts: students getting the “wrong” results on lab tests, a small group of researchers reporting results that disagree with those obtained by the larger research community, and so on.

In any specific case in which bias or error is suspected, Popper notes that researchers might introduce a falsifiable auxiliary hypothesis that allows them to test for it. And in many cases, this is just what they do: students redo the test until they get the expected results, or other research groups attempt to replicate the anomalous result. Popper argues that this technique cannot solve the problem in general, however, since any auxiliary hypotheses researchers introduce and test will themselves be open to dispute in just the same way, and so on ad infinitum. If science is to proceed at all, then, there must be some point at which the process of attempted falsification stops.

In order to resolve this apparently vicious regress, Popper introduces the idea of a basic statement, which is an empirical claim that can be used both to determine whether a given theory is falsifiable, and thus scientific, and, where appropriate, to corroborate falsifying hypotheses. According to Popper, basic statements are “statements asserting that an observable event is occurring in a certain individual region of space and time” (1959, p. 85). More specifically, basic statements must be both singular and existential (the formal requirement) and be testable by intersubjective observation (the material requirement). On Popper’s view, “there is a raven in space-time region k” would count as a basic statement, since it makes a claim about an individual raven whose existence, or lack thereof, could be determined by appropriately located observers. By contrast, the negative existential claim “there are no ravens in space-time region k” does not do this, and thus fails to qualify as a basic statement.
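
Popper’s formal requirement can be illustrated schematically (a reconstruction for illustration only; the predicate names Raven and At are mine, not Popper’s symbolism). The accepted basic statement has the form of a singular existential claim,

\[
\exists x \,\bigl(\mathrm{Raven}(x) \wedge \mathrm{At}(x, k)\bigr),
\]

whereas its negation is logically equivalent to a universal claim about the region,

\[
\forall x \,\bigl(\mathrm{At}(x, k) \rightarrow \neg\,\mathrm{Raven}(x)\bigr),
\]

and so fails the formal requirement of being singular and existential.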

In order to avoid the infinite regress alluded to earlier, where basic statements themselves must be tested in order to justify their status as potential falsifiers, Popper appeals to the role played by convention and what he calls the “relativity of basic statements.” He writes as follows:

Every test of a theory, whether resulting in its corroboration or falsification, must stop at some basic statement or other which we decide to accept. If we do not come to any decision, and do not accept some basic statement or other, then the test will have led nowhere… This procedure has no natural end. Thus if the test is to lead us anywhere, nothing remains but to stop at some point or other and say that we are satisfied, for the time being. (1959, p. 86)

From this, Popper concludes that a given statement’s counting as a basic statement requires the consensus of the relevant scientific community—if the community decides to accept it, it will count as a basic statement; if the community does not accept it as basic, then an effort must be made to test the statement by using it together with other statements to deduce a statement that the relevant community will accept as basic. Finally, if the scientific community cannot reach a consensus on what would count as a falsifier for the disputed statement, the statement itself, despite initial appearances, may not actually be empirical or scientific in the relevant sense.

d. Induction, Corroboration, and Verisimilitude

Falsification also plays a key role in Popper’s proposed solution to David Hume’s infamous problem of induction. On Popper’s interpretation, Hume’s problem involves the impossibility of justifying belief in general laws based on evidence that concerns only particular instances. Popper agrees with Hume that inductive reasoning in this sense could not be justified, and he thus rejects the idea that empirical evidence regarding particular individuals, such as successful predictions, is in any way relevant to confirming the truth of general scientific laws or theories. This places Popper’s view in explicit contrast to logical empiricists such as Carnap and Hempel, who had developed extensive, mathematical systems of inductive logic intended to explicate the degree of confirmation of scientific theories by empirical evidence.

Popper argues that there are in fact two closely related problems of induction: the logical problem of induction and the psychological problem of induction. The first problem concerns the possibility of justifying belief in the truth or falsity of general laws based on empirical evidence that concerns only specific individuals. Popper holds that Hume’s argument concerning this problem “establishes for good that all our universal laws or theories remain forever guesses, conjectures, [and] hypotheses” (1974, p. 1019). However, Popper claims that while a successful prediction is irrelevant to confirming a law, a failed prediction can immediately falsify it. On Popper’s view, then, observing 1,000 white swans does nothing to increase our confidence that the hypothesis “all swans are white” is true; however, the observation of a single black swan can, subject to the caveats mentioned in previous sections, falsify this same hypothesis.
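
The asymmetry Popper appeals to here can be put schematically (a textbook-style reconstruction rather than Popper’s own notation, writing Sx for “x is a swan” and Wx for “x is white”). No finite number of positive instances deductively entails the universal law,

\[
(Sa_1 \wedge Wa_1) \wedge \dots \wedge (Sa_n \wedge Wa_n) \;\nvdash\; \forall x \,(Sx \rightarrow Wx),
\]

however large n is, whereas a single counterinstance deductively refutes it:

\[
Sa \wedge \neg Wa \;\vdash\; \neg\,\forall x \,(Sx \rightarrow Wx).
\]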

In contrast to the logical problem of induction, the psychological problem of induction concerns the possibility of explaining why reasonable people nevertheless have the expectation that unobserved instances will obey the same general laws as did previously observed instances. Hume tries to resolve the psychological problem by appeal to habit or custom, but Popper rejects this solution as inadequate, since it suggests that there is a “clash between the logic and the psychology of knowledge” (1974, p. 1019) and hence that people’s beliefs in general laws are fundamentally irrational.

Popper proposes to solve these twin problems of induction by offering an account of theory preference that does not rely upon inductive inference and thus avoids Hume’s problems altogether. While the technical details of this account evolve throughout his writings, he consistently emphasizes two main points. First, he holds that a theory with greater informative content is to be preferred to one with less content. Here, informative content is a measure of how much a theory rules out; roughly speaking, a theory with more informative content makes a greater number of empirical claims, and thus has a higher degree of falsifiability. Second, Popper holds that a theory is corroborated by passing severe tests, or “by predictions which were highly improbable in the light of our previous knowledge (previous to the theory which was tested and corroborated)” (1963, p. 220).

It is important to distinguish Popper’s claim that a theory is corroborated by surviving a severe test from the logical empiricist view that a theory is inductively confirmed by successfully predicting events that, were the theory false, would have been highly unlikely. According to the latter view, a successful prediction of this sort, subject to certain caveats, provides evidence that the theory in question is actually true. On this view, the question of theory choice is tightly tied to that of confirmation: scientists should adopt whichever theory is most probable in light of the available evidence. On Popper’s view, by contrast, corroboration provides no evidence whatsoever that the theory in question is true, or even that it is preferable to a so-far-untested but still unfalsified rival. Instead, a corroborated theory has shown merely that it is the sort of theory that could be falsified and thus can be legitimately classified as scientific. While a corroborated theory should obviously be preferred to an already falsified rival (see Section 2), the real work here is being done by the falsified theory, which has taken itself out of contention.

While Popper consistently rejects the idea that we are justified in believing that non-falsified, well-corroborated scientific theories with high levels of informative content are either true or likely to be true, his work on degrees of verisimilitude explores the idea that such theories are closer to the truth than the falsified theories they replaced. The basic idea is as follows:

  1. For a given statement H, let the content of H be the class of all of the logical consequences of H. If H is true, then all of the members of this class are true; if H is false, however, then only some members of this class are true, since every false statement has at least some true consequences.
  2. The content of H can be broken into two parts: the truth content, consisting of all the true consequences of H, and the falsity content, consisting of all of the false consequences of H.
  3. The verisimilitude of H is defined as the difference between the truth content of H and the falsity content of H. This is intended to capture the idea that a theory with greater verisimilitude will entail more truths and fewer falsehoods than a theory with less verisimilitude.
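
The definition just sketched can be stated compactly (a reconstruction whose symbols are mine, not Popper’s exact formalism). Writing Cn(H) for the set of logical consequences of H, the truth content and falsity content are

\[
Ct_T(H) = \{\, p \in Cn(H) : p \text{ is true} \,\}, \qquad
Ct_F(H) = \{\, p \in Cn(H) : p \text{ is false} \,\},
\]

and, in the comparative version of the definition, a theory H_2 has greater verisimilitude than H_1 just in case

\[
Ct_T(H_1) \subseteq Ct_T(H_2) \quad \text{and} \quad Ct_F(H_2) \subseteq Ct_F(H_1),
\]

with at least one of these inclusions strict. It is this comparative definition that the arguments of Tichý and Miller, discussed below, show cannot rank any two false theories.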

With this definition in hand, it might now seem that Popper could incorporate truth into his account of theory preference: non-falsified theories with high levels of informative content are closer to the truth than either the falsified theories they replaced or their unfalsified but less informative competitors. Unfortunately, however, this definition does not work, as arguments from Tichý (1974), Miller (1974), Harris (1974), and others show. Tichý and Miller in particular demonstrate that Popper’s proposed definition cannot be used to compare the relative verisimilitude of false theories, which was Popper’s main purpose in introducing the notion. While Popper (1976) explores ways of modifying his proposal to deal with these problems, he is never able to provide a satisfactory formal definition of verisimilitude. His work in this area is nevertheless invaluable in identifying a problem that has continued to interest many contemporary researchers.

3. Criticisms of Falsificationism

While Popper’s account of scientific methodology has continued to be influential, it has also faced a number of serious objections. These objections, together with the emergence of alternative accounts of scientific reasoning, have led many philosophers of science to reject Popper’s falsificationist methodology. While a comprehensive list of these criticisms and alternatives is beyond the scope of this entry, interested readers are encouraged to consult Kuhn (1962), Salmon (1967), Lakatos (1970, 1980), Putnam (1974), Jeffrey (1975), Feyerabend (1975), Hacking (1983), and Howson and Urbach (1989).

One criticism of falsificationism involves the relationship between theory and observation. Thomas Kuhn, among others, argues that observation is itself strongly theory-laden, in the sense that what one observes is often significantly affected by one’s previously held theoretical beliefs. Because of this, those holding different theories might report radically different observations, even when they are observing the same phenomena. For example, Kuhn argues that those working within the paradigm provided by classical, Newtonian mechanics may genuinely have different observations than those working within the very different paradigm of relativistic mechanics.

Popper’s account of basic sentences suggests that he clearly recognizes both the existence of this sort of phenomenon and its potential to cause problems for attempts to falsify theories. His solution to it, however, crucially depends on the ability of the overall scientific community to reach a consensus as to which statements count as basic and thus can be used to formulate tests of the competing theories. This remedy, however, looks less attractive to the extent that advocates of different theories consistently find themselves unable to reach an agreement on what sentences count as basic. For example, it is important to Popper’s example of the Eddington experiment that both proponents of classical mechanics and those of relativistic mechanics could recognize Eddington’s reports of his observations as basic sentences in the relevant sense—that is, certain possible results would falsify the Newtonian laws of classical mechanics, while other possible results would falsify GR. If, by contrast, adherents of rival theories consistently disagreed on whether or not certain reports could be counted as basic sentences, this would prevent observations such as Eddington’s from serving any important role in theory choice. Instead, the results of any such potentially falsifying experiment would be interpreted by one part of the community as falsifying a particular theory, while a different section of the community would demand that these reports themselves be subjected to further testing.  In this way, disagreements over the status of basic sentences would effectively prevent theories from ever being falsified.

This purported failure to clearly distinguish the basic statements that formed the empirical base from other, more theoretical, statements would also have consequences for Popper’s proposed criterion of demarcation, which holds that scientific theories must allow for the deduction of basic sentences whose truth or falsity can be ascertained by appropriately located observers. If, contrary to Popper’s account, there is no distinct category of basic sentences within actual scientific practice, then his proposed method for distinguishing science from non-science fails.

A second, related criticism of falsificationism contends that it fails to provide an accurate picture of scientific practice. Specifically, many historians and philosophers of science have argued that scientists only rarely give up their theories in the face of failed predictions, even in cases where they are unable to identify testable auxiliary hypotheses; instead, they generally hold on to such theories unless and until a better alternative emerges. It has also been suggested that scientists routinely adopt and make use of theories that they know to be already falsified.

For example, Lakatos (1970) describes a hypothetical case where pre-Einsteinian scientists discover a new planet whose behavior apparently violates classical mechanics. Lakatos argues that, in such a case, the scientists would surely attempt to account for these observed discrepancies in the way that Popper advocates—for example, by hypothesizing the existence of a hitherto unobserved planet or dust cloud. In contrast to what he takes Popper to be arguing, however, Lakatos contends that the failure of such auxiliary hypotheses would not lead them to abandon classical mechanics, since they had no alternative theory to turn to.

In a similar vein, Putnam (1974) argues that the initial widespread acceptance of Newtonian mechanics had little or nothing to do with falsifiable predictions, since the theory made very few of these. Instead, scientists were impressed by the theory’s success in explaining previously established phenomena, such as the orbits of the planets and the behavior of the tides. Putnam argues that, on Popper’s view, accepting such an uncorroborated theory would seem to be irrational. Finally, Hacking (1983) argues that many aspects of ordinary scientific practice, including a wide variety of observations and experiments, cannot plausibly be construed as attempts to falsify or corroborate any particular theory or hypothesis. Instead, scientists regularly perform experiments that have little or no bearing on their current theories and measure quantities about which these theories do not make any specific claims.

When considering the cogency of such criticisms, several points are worth noting. First, Popper defends falsificationism as a normative, methodological proposal for how science ought to work in certain sorts of cases and not as an empirical description intended to accurately capture all aspects of historical scientific practice. Second, Popper does not commit himself to the implausible thesis that theories yielding false predictions about a particular phenomenon must immediately be abandoned, even if it is not apparent which auxiliary hypotheses must change. This is especially true in the absence of any rival theory yielding a correct prediction. For example, Newtonian mechanics had well-known problems with predicting certain sorts of phenomena, such as the orbit of Mercury, in the years preceding Einstein’s proposals regarding special and general relativity. Popper’s proposal does not entail that these failures of prediction should have led nineteenth-century scientists to abandon this theory.

This being said, Popper himself argues that the methodology of falsificationism has played an important role in the history of science and that adopting his proposal would not require a wholesale revision of existing scientific methodology. If it turns out that scientists rarely, if ever, make theory choices on the basis of crucial experiments that falsify one theory or another, then Popper’s methodological proposal looks considerably less appealing.

A final criticism concerns Popper’s account of corroboration and the role it plays in theory choice. Popper’s deductive account of theory testing and adoption posits that it is rational to choose highly informative, well-corroborated theories, even though we have no inductive grounds for thinking that these theories are likely to be true. For example, Popper explicitly rejects the idea that corroboration is intended as an analogue to the subjective probability or logical probability that a theory is true, given the available evidence. This idea is central to both Popper’s proposed solution to the problem of induction and to his criticisms of competing inductivist or “Bayesian” programs.

Many philosophers of science, however, including Salmon (1967, 1981), Jeffrey (1975), Howson (1984a), and Howson and Urbach (1989), have objected to this aspect of Popper’s account. One line of criticism has focused on the extent to which Popper’s falsificationism offers a legitimate alternative to the inductivist proposals that Popper criticizes. For example, Jeffrey (1975) points out that it is just as difficult to conclusively falsify a hypothesis as it is to conclusively verify it, and he argues that Bayesianism, with its emphasis on the degree to which empirical evidence supports a hypothesis, is much more closely aligned with scientific practice than Popper’s program.

A related line of objection has focused on Popper’s contention that it is rational for scientists to rely on corroborated theories, a claim that plays a central role in his proposed solution to the problem of induction. Urbach (1984) argues that, insofar as Popper is committed to the claim that every universal hypothesis has zero probability of being true, he cannot explain the rationality of adopting a corroborated theory over an already falsified one, since both have the same probability (zero) of being true. Taking a different tack, Salmon (1981) questions whether, on Popper’s account, it would be rational to use corroborated hypotheses for the purposes of prediction. After all, corroboration is entirely a matter of a hypothesis’s past performance—a corroborated hypothesis is one that has survived severe empirical tests. Popper’s account, however, does not provide us with any reason for thinking that this hypothesis will make more accurate predictions about the future than any one of the infinite number of competing, uncorroborated hypotheses that are also logically compatible with all of the evidence observed up to this point.

If these objections concerning corroboration are correct, it looks as though Popper’s account of theory choice either (1) is vulnerable to the same sorts of problems and puzzles that plague accounts of theory choice based on induction or (2) does not work as an account of theory choice at all.

While the sorts of objections mentioned here have led many to abandon falsificationism, David Miller (1998) provides a recent, sustained attempt to defend a Popperian-style critical rationalism. For more details on debates concerning confirmation and induction, see the entries on Confirmation and Induction and Evidence.

4. Realism, Quantum Mechanics, and Probability

While Popper holds that it is impossible for us to justify claims that particular scientific theories are true, he also defends the realist view that “what we attempt in science is to describe and (so far as possible) explain reality” (1975, p. 40). While Popper grants that realism is, according to his own criteria, an irrefutable metaphysical view about the nature of reality, he nevertheless thinks we have good reasons for accepting realism and for rejecting anti-realist views such as idealism or instrumentalism. In particular, he argues that realism is both part of common sense and entailed by our best scientific theories. By contrast, he contends that the most prominent arguments for anti-realism are based on a “mistaken quest for certainty, or for secure foundations on which to build” (1975, p. 42). Once one accepts the impossibility of securing such certain knowledge, as Popper contends we ought to do, the appeal of these sorts of arguments is considerably diminished.

Popper consistently emphasizes that scientific theories should be interpreted as attempts to describe a mind-independent reality. Because of this, he rejects the Copenhagen interpretation of quantum mechanics, in which the act of human measurement is seen as playing a fundamental role in collapsing the wave-function and randomly causing a particle to assume a determinate position or momentum. In particular, Popper opposes the idea, which he associates with the Copenhagen interpretation, that the probabilistic equations describing the results of potential measurements of quantum phenomena are about the subjective states of the human observers, rather than about mind-independently existing physical properties such as the positions or momenta of particles.

It is in the context of this debate over quantum mechanics that Popper first introduces his propensity theory of probability. This theory’s applicability, however, extends well beyond the quantum world, and Popper argues that it can be used to interpret the sorts of claims about probability that arise both in other areas of science and in everyday life. Popper’s propensity theory holds that probabilities are objective claims about the mind-independent external world and that it is possible for there to be single-case probabilities for non-recurring events.

Popper proposes his propensity theory as a variant of the relative frequency theories of probability defended by logical positivists such as Richard von Mises and Hans Reichenbach. According to simple versions of frequency theory, the probability of an event of type e can be defined as the relative frequency of e in a large, or perhaps even infinite, reference class. For example, the claim that the “the probability of getting a six on a fair die is 1/6” can be understood as the claim that, in a long sequence of rolls with a fair die (the reference class), six would come up 1/6 of the time. The main alternatives to frequency theory that concern Popper are logical and subjective theories of probability, according to which claims about probability should be understood as claims about the strength of evidence for or degree of belief in some proposition. On these views, the claim that “the probability of getting a six on a fair die is 1/6” can be understood as a claim about our lack of evidence—if all we know is that the die is fair, then we have no reason to think that any particular number, such as a six, is more likely to come up on the next roll than any of the other five possible numbers.

Like other defenders of frequency theories, Popper argues that logical or subjective theories incorrectly interpret scientific claims about probability as being about the scientific investigators, and the evidence they have available to them, rather than the external world they are investigating. However, Popper argues that traditional frequency theories cannot account for single-case probabilities. For example, a frequency theorist would have no problem answering questions about “the probability that it will rain on an arbitrarily chosen August day,” since August days form a reference class. By contrast, questions about the probability that it will rain on a particular, future August day raise problems, since each particular day occurs only once. At best, frequency theories allow us to say that the probability of it raining on that specific day is either 0 or 1, though we do not know which.
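
The frequency reading, and the single-case difficulty just noted, can be made concrete with a toy computation (a purely illustrative sketch; the function name and the number of trials are my own choices, not drawn from Popper or the frequency theorists):

    import random

    def relative_frequency_of_six(trials: int = 100_000) -> float:
        """Estimate P(six) as the relative frequency of sixes over a long run of rolls."""
        sixes = sum(1 for _ in range(trials) if random.randint(1, 6) == 6)
        return sixes / trials

    print(relative_frequency_of_six())  # close to 1/6 for a (simulated) fair die

    # For a single, non-repeatable event there is no long run of this kind to count
    # over: the frequency "in" that one case is simply 0 or 1, which is the gap
    # Popper's propensity interpretation is meant to fill.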

On Popper’s view, the failure to provide an adequate treatment of single-case probabilities is a serious one, especially given what he saw as the centrality of such probabilities in quantum mechanics. To resolve this issue, Popper proposes that probabilities should be treated as the propensities of experimental setups to produce certain results, rather than as being derived from the reference class of results that were produced by running these experiments. On the propensity view, the results of experiments are important because they allow us to test hypotheses concerning the values of certain probabilities; however, the results are not themselves part of the probability. Popper argues that this solves the problem of single-case probability, since propensities can exist even for experiments that only happen once. Importantly, Popper does not require that these experiments utilize human intervention—instead, nature can itself run experiments, the results of which we can observe. For example, the propensity theory should, in theory, be able to make sense of claims about the probability that it will rain on a particular day, even though the experimental setup in this case is constituted by naturally occurring, meteorological phenomena.

Popper argues that the propensity theory of probability helps provide the grounds for a realist solution to the measurement problem within quantum mechanics. As opposed to the Copenhagen interpretation, which posits that the probabilities discussed in quantum mechanics reflect the ignorance of the observers, Popper argues these probabilities are in fact the propensities of the experimental setups to produce certain outcomes. Interpreted this way, he argues that they raise no interesting metaphysical dilemmas beyond those raised by classical mechanics and that they are equally amenable to a realist interpretation. Popper gives the example of tossing a penny, which he argues is strictly analogous to the experiments performed in quantum mechanics: if our experimental setup consists of simply tossing the penny, then the probability of getting heads is 1/2. If the experimental setup, however, is expanded to include the results of our looking at the penny, and thus includes the outcome of the experiment itself, then the probability will be either 0 or 1. This does not, though, involve positing any collapse of the wave-function caused merely by the act of human observation. Instead, what has occurred is simply a change in the experimental setup. Once we include the measurement result in our setup, the probability of a particular outcome will trivially become 0 or 1.
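
Popper’s penny example can be rendered as a toy calculation (an illustrative sketch in the same spirit, not Popper’s own formulation): once the observed result is built into the description of the experimental setup, the relevant probability trivially becomes 0 or 1, without any appeal to observer-induced collapse.

    import random

    # Setup A: the experiment is "toss the penny." The setup's propensity to
    # produce heads is 1/2, whether or not anyone looks at the coin.
    p_heads_setup_a = 0.5

    # Setup B: the experiment is "toss the penny and record the observed result."
    # Relative to a setup that already includes the outcome, the probability of
    # heads is trivially 0 or 1; the setup has simply changed.
    observed = random.choice(["heads", "tails"])
    p_heads_setup_b = 1.0 if observed == "heads" else 0.0

    print(p_heads_setup_a, observed, p_heads_setup_b)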

5. Methodology in the Social Sciences

Much of Popper’s early work on the methodology of science is concerned with physics and closely related fields, especially those where experimentation plays a central role. On Popper’s view, which was discussed in detail in previous sections, these sciences make progress by formulating a theory and then carefully designing experiments and observations aimed at falsifying the purported theory. The ever-present possibility that a theory might be falsified by these sorts of tests is, on Popper’s view, precisely what differentiates legitimate sciences, such as physics, from non-scientific activities, such as philosophical metaphysics, Freudian psychoanalysis, or myth-making.

This picture becomes somewhat more complicated, however, when we consider methodology in social sciences such as sociology and economics, where experimentation plays a much less central role. On Popper’s view, there are significant problems with many of the methods used in these disciplines. In particular, Popper argues against what he calls historicism, which he describes as “an approach to the social sciences which assumes that historical prediction is their principal aim, and which assumes that this aim is attainable by discovering the ‘rhythms’ or ‘patterns’, the ‘laws’ or ‘trends’ that underlie the evolution of history” (1957, p. 3).

Popper’s central argument against historicism contends that, insofar as the whole of human history is a singular process that occurs only once, it is impossible to formulate and test any general laws about history. This stands in stark contrast to disciplines such as physics, where the formulation and testing of laws plays a central role in making progress. For example, potential laws of gravitation can be tested by observations of planetary motions, by controlled experiments concerning the rates of falling objects near the earth’s surface, or in numerous other ways. If the relevant theories are falsified, scientists can easily respond, for instance, by changing one or more auxiliary hypotheses, and then conducting additional experiments on the new, slightly modified theory. By contrast, a law that purports to describe the future progress of history in its entirety cannot easily be tested in this way. Even if a particular prediction about the occurrence of some particular event is incorrect, there is no way of altering the theory to retest it—each historical event occurs only once, thus ruling out the possibility of carrying out further tests regarding this event. Popper also rejects the claim that it is possible to formulate and test laws of more limited scope, such as those that purport to describe an evolutionary process that occurs in multiple societies, or that attempt to capture a trend within a given society.

Popper’s opposition to historicism is also evident in his objections to what he calls utopian social engineering, which involves attempts by governments to fundamentally restructure the whole of society based on an overall plan or blueprint. On Popper’s view, the problem again concerns the impossibility of carrying out critical tests of the effectiveness of such plans. This impossibility stems from the holism of utopian plans, which involve changing everything at the same time. When the planners’ actions fail to achieve their predicted results, as Popper thinks is inevitably the case with such wholesale interventions in society, the planners have no method for determining what in particular went wrong with their plan. This lack of testability, in turn, means that there is no way for the utopian engineers to improve their plans. This argument, among others, plays a central role in Popper’s critique of Marxism and totalitarianism in The Open Society and Its Enemies (1945). More details on Popper’s political philosophy, including his critique of totalitarian societies, can be found in the separate article on his political philosophy.

In place of historicism and utopian holism, Popper argues that the social sciences should embrace both methodological individualism and situational analysis. On Popper’s definition, methodological individualism is the view that the behavior of social institutions should be analyzed in terms of the behaviors of the individual humans that make them up. This individualism is motivated, in part, by Popper’s contention that many important social institutions, such as the market, are not the result of any conscious design but instead arise out of the uncoordinated actions of individuals with widely disparate motives. Scientific hypotheses about the behavior of such unplanned institutions, then, must be formulated in terms of their constituent participants. Popper’s presentation and defense of methodological individualism is closely related to that provided by the Austrian economist Friedrich von Hayek (1942, 1943, 1944), with whom Popper maintained close personal and professional relationships throughout most of his life. For both Popper and Hayek, the defense of methodological individualism within the social sciences plays a key role in their broader argument in favor of liberal, market economies and against planned economies.

While Popper endorses methodological individualism, he rejects the doctrine of psychologism, according to which laws about social institutions must be reduced to psychological laws concerning the behavior of individuals. Popper objects to this view, which he associates with John Stuart Mill, on the grounds that it ends up collapsing into a form of historicism. The argument can be summarized as follows: once we begin trying to explain or predict the behavior of currently existing institutions in terms of individuals’ psychological motives, we quickly notice that these motives themselves cannot be understood without reference to the broader social environment within which these individuals find themselves. In order to eliminate the reference to the particular social institutions that make up this environment, we are then forced to demonstrate how these institutions were themselves a product of individual motives that had operated within some other previously existing social environment. This, though, quickly leads to an unsustainable regress, since humans always act within particular social environments, and their motives cannot be understood without reference to these environments. The only way out for the advocate of psychologism is to posit that both the origin and evolution of all human institutions can be explained purely in terms of human psychology. Popper argues that there is no historical support for the idea that there was ever such an origin of social institutions. He also argues that this is a form of historicism, insofar as it commits us to discovering laws governing the evolution of society as a whole. As such, it inherits all of the problems mentioned previously.

In place of psychologism, Popper endorses a version of methodological individualism based on situational analysis. On this method, we begin by creating abstract models of the social institutions that we wish to investigate, such as markets or political institutions. In keeping with methodological individualism, these models will contain, among other things, representations of individual agents. However, instead of stipulating that these agents will behave according to the laws governing individual human psychology, as psychologism does, we animate the model by assuming that the agents will respond appropriately according to the logic of the situation. Popper calls this constraint on model building within the social sciences the rationality principle.

Popper recognizes that both the rationality principle and the models built on the basis of it are empirically false—after all, real humans often respond to situations in ways that are irrational and inappropriate. Popper also rejects, however, the idea that the rationality principle should be thought of as a methodological principle that is a priori immune to testing, since part of what makes theories in the social sciences testable is the fact that they make definite claims about individual human behavior. Instead, Popper defends the use of the rationality principle in model building on the grounds that it is generally good policy to avoid blaming the falsification of a model on the inaccuracies introduced by the rationality principle and that we can learn more if we blame the other assumptions of our situational analysis (1994, p. 177). On Popper’s view, the errors introduced by the rationality principle are generally small ones, since humans are generally rational. More importantly, holding the rationality principle fixed makes it much easier for us to formulate crucial tests of rival theories and to make genuine progress in the social sciences. By contrast, if the rationality principle were relaxed, he argues, there would be almost no substantive constraints on model building.

6. Popper’s Legacy

While few of Popper’s individual claims have escaped criticism, his contributions to philosophy of science are immense. As mentioned earlier, Popper was one of the most important critics of the early logical empiricist program, and the criticisms he leveled against it helped shape the future work of both the logical empiricists and their critics. In addition, while his falsification-based approach to scientific methodology is no longer widely accepted within philosophy of science, it played a key role in laying the ground for later work in the field, including that of Kuhn, Lakatos, and Feyerabend, as well as contemporary Bayesianism. It is also plausible that the widespread popularity of falsificationism—both within and outside of the scientific community—has had an important role in reinforcing the image of science as an essentially empirical activity and in highlighting the ways in which genuine scientific work differs from so-called pseudoscience. Finally, Popper’s work on numerous specialized issues within the philosophy of science—including verisimilitude, quantum mechanics, the propensity theory of probability, and methodological individualism—has continued to influence contemporary researchers.

7. References and Further Reading

Popper Selections (1985) is an excellent introduction to Popper’s writings for the beginner, while The Philosophy of Karl Popper (Schilpp 1974) contains an extensive bibliography of Popper’s work published before that date, together with numerous critical essays and Popper’s responses to these. Finally, Unended Quest (1976) is an expanded version of the “Intellectual Autobiography” from Schilpp (1974), and it provides a helpful, non-technical overview of many of Popper’s main works in his own words.

a. Primary Sources

  • 1945. The Open Society and Its Enemies. 2 volumes. London: Routledge.
  • 1957. The Poverty of Historicism. London: Routledge. Originally published as a series of three articles in Economica 42, 43, and 46 (1944-1945).
  • 1959. The Logic of Scientific Discovery. London: Hutchinson. This is an English translation of Logik der Forschung, Vienna: Springer (1935).
  • 1959. “The Propensity Interpretation of Probability.” The British Journal for the Philosophy of Science 10 (37): 25–42.
  • 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge. Fifth edition 1989.
  • 1970. “Normal Science and Its Dangers.” In Criticism and the Growth of Knowledge, edited by Imre Lakatos and Alan Musgrave, 51–58. Cambridge: Cambridge University Press.
  • 1972. Objective Knowledge: An Evolutionary Approach. Oxford: Clarendon Press. Revised edition 1979.
  • 1974. “Replies to My Critics” and “Intellectual Autobiography.” In The Philosophy of Karl Popper, edited by Paul Arthur Schilpp. 2 volumes. La Salle, Ill: Open Court.
  • 1976. Unended Quest. London: Fontana. Revised edition 1984.
  • 1976. “A Note on Verisimilitude.” The British Journal for the Philosophy of Science 27 (2): 147–59.
  • 1978. “Natural Selection and the Emergence of Mind.” Dialectica 32 (3-4): 339–55.
  • 1982. The Open Universe: An Argument for Indeterminism. Edited by W. W. Bartley III. London: Routledge.
  • 1982. Quantum Theory and the Schism in Physics. Edited by W. W. Bartley III. New York: Routledge.
  • 1983. Realism and the Aim of Science. Edited by W. W. Bartley III. New York: Routledge.
  • 1985. Popper Selections. Edited by David W. Miller. Princeton: Princeton University Press.
  • 1994. The Myth of the Framework: In Defense of Science and Rationality. Edited by Mark Amadeus Notturno. London: Routledge.
  • 1999. All Life Is Problem Solving. London: Routledge.

b. Secondary Sources

  • Ackermann, Robert John. 1976. The Philosophy of Karl Popper. Amherst: University of Mass. Press.
  • Agassi, Joseph. 2014. Popper and His Popular Critics: Thomas Kuhn, Paul Feyerabend and Imre Lakatos. 2014 edition. New York: Springer.
  • Blaug, Mark. 1992. The Methodology of Economics: Or, How Economists Explain. 2nd edition. New York: Cambridge University Press.
  • Caldwell, Bruce J. 1991. “Clarifying Popper.” Journal of Economic Literature 29 (1): 1–33.
  • Carnap, Rudolf. 1936. “Testability and Meaning.” Philosophy of Science 3 (4): 419–71. Continued in Philosophy of Science 4 (1): 1-40.
  • Carnap, Rudolf. 1995. An Introduction to the Philosophy of Science. New York: Dover. Originally published as Philosophical Foundations of Physics (1966).
  • Carnap, Rudolf.  2003. The Logical Structure of the World and Pseudoproblems in Philosophy. Translated by Rolf A. George. Chicago and La Salle, Ill: Open Court. Originally published in 1928 as Der logische Aufbau der Welt and Scheinprobleme in der Philosophie.
  • Catton, Philip, and Graham MacDonald, eds. 2004. Karl Popper: Critical Appraisals. New York: Routledge.
  • Currie, Gregory, and Alan Musgrave, eds. 1985. Popper and the Human Sciences. Dordrecht: Martinus Nijhoff.
  • Edmonds, David, and John Eidinow. 2002. Wittgenstein’s Poker: The Story of a Ten-Minute Argument Between Two Great Philosophers. Reprint edition. New York: Harper Perennial.
  • Feyerabend, Paul. 1975. Against Method. London; New York: New Left Books. Fourth edition 2010.
  • Fuller, Steve. 2004. Kuhn vs. Popper: The Struggle for the Soul of Science. New York: Columbia University Press.
  • Gattei, Stefano. 2010. Karl Popper’s Philosophy of Science: Rationality without Foundations. London; New York: Routledge.
  • Grünbaum, Adolf. 1976. “Is Falsifiability the Touchstone of Scientific Rationality? Karl Popper Versus Inductivism.” In Essays in Memory of Imre Lakatos, edited by R. S. Cohen, P. K. Feyerabend, and M. W. Wartofsky, 213–52. Dordrecht: Springer Netherlands.
  • Hacking, Ian. 1983. Representing and Intervening: Introductory Topics in the Philosophy of Natural Science. Cambridge; New York: Cambridge University Press.
  • Hacohen, Malachi Haim. 2002. Karl Popper: The Formative Years, 1902–1945: Politics and Philosophy in Interwar Vienna. Cambridge: Cambridge University Press.
  • Hands, Douglas W. 1985. “Karl Popper and Economic Methodology: A New Look.” Economics and Philosophy 1 (1): 83–99.
  • Harris, John H. 1974. “Popper’s Definitions of ‘Verisimilitude.’” The British Journal for the Philosophy of Science 25 (2): 160–66.
  • Hausman, Daniel M. 1985. “Is Falsificationism Unpractised or Unpractisable?” Philosophy of the Social Sciences 15 (3): 313–19.
  • Hayek, Friedrich von. 1942. “Scientism and the Study of Society. Part I.” Economica, New Series, 9 (35): 267–91.
  • Hayek, Friedrich von. 1943. “Scientism and the Study of Society. Part II.” Economica, New Series, 10 (37): 34–63.
  • Hayek, Friedrich von. 1944. “Scientism and the Study of Society. Part III.” Economica, New Series, 11 (41): 27–39.
  • Hempel, Carl G. 1945a. “Studies in the Logic of Confirmation (I.).” Mind, New Series, 54 (213): 1–26.
  • Hempel, Carl G. 1945b. “Studies in the Logic of Confirmation (II.).” Mind, New Series, 54 (214): 97–121.
  • Howson, Colin. 1984a. “Popper’s Solution to the Problem of Induction.” The Philosophical Quarterly 34 (135): 143–47.
  • Howson, Colin. 1984b. “Probabilities, Propensities, and Chances.” Erkenntnis 21 (3): 279–93.
  • Howson, Colin, and Peter Urbach. 1989. Scientific Reasoning: The Bayesian Approach. Chicago: Open Court Publishing. Third edition 2006.
  • Hudelson, Richard. 1980. “Popper’s Critique of Marx.” Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 37 (3): 259–70.
  • Hume, David. 1993. An Enquiry Concerning Human Understanding: With Hume’s Abstract of A Treatise of Human Nature and A Letter from a Gentleman to His Friend in Edinburgh. Edited by Eric Steinberg. 2nd ed. Indianapolis: Hackett Publishing Company, Inc.
  • Jeffrey, Richard C. 1975. “Probability and Falsification: Critique of the Popper Program.” Synthese 30 (1/2): 95–117.
  • Keuth, Herbert. 2004. The Philosophy of Karl Popper. New York: Cambridge University Press.
  • Kuhn, Thomas S. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press. Third edition 1996.
  • Lakatos, Imre. 1970. “Falsification and the Methodology of Scientific Research Programmes.” In Criticism and the Growth of Knowledge, edited by Imre Lakatos and Alan Musgrave, 91–196. Cambridge: Cambridge University Press.
  • Lakatos, Imre.  1980. The Methodology of Scientific Research Programmes: Volume 1: Philosophical Papers. Cambridge University Press.
  • Lakatos, Imre, and Alan Musgrave, eds. 1970. Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press.
  • Levi, Isaac. 1963. “Corroboration and Rules of Acceptance.” The British Journal for the Philosophy of Science 13 (52): 307–13.
  • Maher, Patrick. 1990. “Why Scientists Gather Evidence.” The British Journal for the Philosophy of Science 41 (1): 103-119.
  • Magee, Bryan. 1985. Philosophy and the Real World: An Introduction to Karl Popper. La Salle, Ill: Open Court.
  • Miller, David. 1974. “Popper’s Qualitative Theory of Verisimilitude.” British Journal for the Philosophy of Science, 166–77.
  • Miller, David. 1998. Critical Rationalism: A Restatement and Defense. Chicago: Open Court.
  • Munz, Peter. 1985. Our Knowledge of the Growth of Knowledge: Popper or Wittgenstein? London; New York: Routledge.
  • O’Hear, Anthony. 1996. Karl Popper: Philosophy and Problems. Cambridge; New York: Cambridge University Press.
  • Putnam, Hilary. 1974. “The ‘corroboration’ of Theories.” In The Philosophy of Karl Popper, edited by Paul Arthur Schilpp, 221–40. La Salle, Ill: Open Court.
  • Rowbottom, Darrell. 2010. Popper’s Critical Rationalism: A Philosophical Investigation. New York: Routledge.
  • Runde, Jochen. 1996. “On Popper, Probabilities, and Propensities.” Review of Social Economy 54 (4): 465–85.
  • Ruse, Michael. 1977. “Karl Popper’s Philosophy of Biology.” Philosophy of Science 44 (4): 638–61.
  • Salmon, Wesley. 1967. The Foundations of Scientific Inference. Pittsburgh: University of Pittsburgh Press.
  • Salmon, Wesley. 1981. “Rational Prediction.” The British Journal for the Philosophy of Science 32 (2): 115–25.
  • Schilpp, Paul Arthur, ed. 1974. The Philosophy of Karl Popper. 2 volumes. La Salle, Ill: Open Court.
  • Thornton, Stephen. 2014. “Karl Popper.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta.
  • Tichý, Pavel. 1974. “On Popper’s Definitions of Verisimilitude.” The British Journal for the Philosophy of Science 25 (2): 155–60.
  • Urbach, Peter. 1978. “Is Any of Popper’s Arguments against Historicism Valid?” The British Journal for the Philosophy of Science 29 (2): 117–30.

 

Author Information

Brendan Shea
Email: Brendan.Shea@rctc.edu
Rochester Community and Technical College, Minnesota Center for Philosophy of Science
U. S. A.

Albert Camus (1913—1960)

Albert Camus was a French-Algerian journalist, playwright, novelist, philosophical essayist, and Nobel laureate. Though he was neither by advanced training nor profession a philosopher, he nevertheless made important, forceful contributions to a wide range of issues in moral philosophy in his novels, reviews, articles, essays, and speeches—from terrorism and political violence to suicide and the death penalty. He is often described as an existentialist writer, though he himself disavowed the label. He began his literary career as a political journalist and as an actor, director, and playwright in his native Algeria. Later, while living in occupied France during WWII, he became active in the Resistance and from 1944-47 served as editor-in-chief of the newspaper Combat. By mid-century, based on the strength of his three novels (The Stranger, The Plague, and The Fall) and two book-length philosophical essays (The Myth of Sisyphus and The Rebel), he had achieved an international reputation and readership. It was in these works that he introduced and developed the twin philosophical ideas—the concept of the Absurd and the notion of Revolt—that made him famous. These are the ideas that people immediately think of when they hear the name Albert Camus spoken today. The Absurd can be defined as a metaphysical tension or opposition that results from the presence of human consciousness—with its ever-pressing demand for order and meaning in life—in an essentially meaningless and indifferent universe. Camus considered the Absurd to be a fundamental and even defining characteristic of the modern human condition. The notion of Revolt refers to both a path of resolved action and a state of mind. It can take extreme forms such as terrorism or a reckless and unrestrained egoism (both of which are rejected by Camus), but basically, and in simple terms, it consists of an attitude of heroic defiance or resistance to whatever oppresses human beings. In awarding Camus its prize for literature in 1957, the Nobel Prize committee cited his persistent efforts to “illuminate the problem of the human conscience in our time.” He was honored by his own generation, and is still admired today, for being a writer of conscience and a champion of imaginative literature as a vehicle of philosophical insight and moral truth. He was at the height of his career—at work on an autobiographical novel, planning new projects for theatre, film, and television, and still seeking a solution to the lacerating political turmoil in his homeland—when he died tragically in an automobile accident in January 1960.

Table of Contents

  1. Life
  2. Literary Career
  3. Camus, Philosophical Literature, and the Novel of Ideas
  4. Works
    1. Fiction
    2. Drama
    3. Essays, Letters, Prose Collections, Articles, and Reviews
  5. Philosophy
    1. Background and Influences
    2. Development
    3. Themes and Ideas
      1. The Absurd
      2. Revolt
      3. The Outsider
      4. Guilt and Innocence
      5. Christianity vs. “Paganism”
      6. Individual vs. History and Mass Culture
      7. Suicide
      8. The Death Penalty
  6. Existentialism
  7. Camus, Colonialism, and Algeria
  8. Significance and Legacy
  9. References and Further Reading
    1. Works by Albert Camus
    2. Critical and Biographical Studies

1. Life

Albert Camus was born on November 7, 1913, in Mondovi, a small village near the seaport city of Bône (present-day Annaba) in the northeast region of French Algeria. He was the second child of Lucien Auguste Camus, a military veteran and wine-shipping clerk, and of Catherine Hélène (Sintès) Camus, a housekeeper and part-time factory worker. (Note: Although Camus believed that his father was Alsatian and a first-generation émigré, research by biographer Herbert Lottman indicates that the Camus family was originally from Bordeaux and that the first Camus to leave France for Algeria was actually the author’s great-grandfather, who in the early 19th century became part of the first wave of European colonial settlers in the new melting pot of North Africa.)

Shortly after the outbreak of WWI, when Camus was less than a year old, his father was recalled to military service and, on October 11, 1914, died of shrapnel wounds suffered at the first battle of the Marne. As a child, about the only thing Camus ever learned about his father was that he had once become violently ill after witnessing a public execution. This anecdote, which surfaces in fictional form in the author’s novel The Stranger and is also recounted in his philosophical essay “Reflections on the Guillotine,” strongly affected Camus and influenced his lifelong opposition to the death penalty.

After his father’s death, Camus, his mother, and his older brother moved to Algiers where they lived with his maternal uncle and grandmother in her cramped second-floor apartment in the working-class district of Belcourt. Camus’s mother Catherine, who was illiterate, partially deaf, and afflicted with a speech impediment, worked in an ammunition factory and cleaned homes to help support the family. In his posthumously published autobiographical novel The First Man, Camus recalls this period of his life with a mixture of pain and affection as he describes conditions of harsh poverty (the three-room apartment had no bathroom, no electricity, and no running water) relieved by hunting trips, family outings, childhood games, and scenic flashes of sun, seashore, mountain, and desert.

Camus attended elementary school at the local Ecole Communale, and it was there that he encountered the first in a series of teacher-mentors who recognized and nurtured the young boy’s lively intelligence. These father figures introduced him to a new world of history and imagination and to literary landscapes far beyond the dusty streets of Belcourt and working-class poverty. Though stigmatized as a pupille de la nation (that is, a war veteran’s child dependent on public welfare) and hampered by recurrent health issues, Camus distinguished himself as a student and was eventually awarded a scholarship to attend high school at the Grand Lycee. Located near the famous Kasbah district, the school brought him into close proximity with the native Muslim community and thus gave him an early recognition of the idea of the “outsider” that would dominate his later writings.

It was in secondary school that Camus became an avid reader (absorbing Gide, Proust, Verlaine, and Bergson, among others), learned Latin and English, and developed a lifelong interest in literature, art, theatre, and film. He also enjoyed sports, especially soccer, of which he once wrote (recalling his early experience as a goal-keeper): “I learned . . . that a ball never arrives from the direction you expected it. That helped me in later life, especially in mainland France, where nobody plays straight.” It was also during this period that Camus suffered his first serious attack of tuberculosis, a disease that was to afflict him, on and off, throughout his career.

By the time he finished his Baccalauréat degree in June 1932, Camus was already contributing articles to Sud, a literary monthly, and looking forward to a career in journalism, the arts, or higher education. The next four years (1933-37) were an especially busy period in his life during which he attended college, worked at odd jobs, married his first wife (Simone Hié), divorced, briefly joined the Communist party, and effectively began his professional theatrical and writing career. Among his various employments during this time were stints of routine office work: one job consisted of a Bartleby-like recording and sifting of meteorological data, another of paper shuffling in an auto license bureau. One can well imagine that it was as a result of this experience that his famous conception of Sisyphean struggle, heroic defiance in the face of the Absurd, first began to take shape within his imagination.

In 1933, Camus enrolled at the University of Algiers to pursue his diplome d’etudes superieures, specializing in philosophy and gaining certificates in sociology and psychology along the way. In 1936, he became a co-founder, along with a group of young fellow intellectuals, of the Théâtre du Travail, an amateur theatrical company specializing in drama with left-wing political themes. Camus served the company as both an actor and director and also contributed scripts, including his first published play, Revolt in Asturias, a drama based on the ill-fated 1934 revolt of Asturian miners in Spain. In 1936 he also earned his degree and completed his dissertation, a study of the influence of Plotinus and neo-Platonism on the thought and writings of St. Augustine.

Over the next three years Camus further established himself as an emerging author, journalist, and theatre professional. After his disillusionment with and eventual expulsion from the Communist Party, he reorganized his dramatic company and renamed it the Théâtre de l’Equipe (literally the Theater of the Team). The name change signaled a new emphasis on classic drama and avant-garde aesthetics and a shift away from labor politics and agitprop. In 1938 he joined the staff of a new daily newspaper, the Alger Républicain, where his assignments as a reporter and reviewer covered everything from contemporary European literature to local political trials. It was during this period that he also published his first two literary works—Betwixt and Between (1937), a collection of five short semi-autobiographical and philosophical pieces, and Nuptials (1938), a series of lyrical celebrations interspersed with political and philosophical reflections on North Africa and the Mediterranean.

The 1940s witnessed Camus’s gradual ascendance to the rank of world-class literary intellectual. He started the decade as a locally acclaimed author and playwright, a figure virtually unknown outside the city of Algiers; he ended it as an internationally recognized novelist, dramatist, journalist, philosophical essayist, and champion of freedom. This period of his life began inauspiciously—war in Europe, the occupation of France, official censorship, and a widening crackdown on left-wing journals. Camus was still without stable employment or steady income when, after marrying his second wife, Francine Faure, in December of 1940, he departed Lyons, where he had been working as a journalist, and returned to Algeria. To help make ends meet, he taught part-time (French history and geography) at a private school in Oran. All the while he was putting the finishing touches on his first novel, The Stranger, which was finally published in 1942 to favorable critical response, including a lengthy and penetrating review by Jean-Paul Sartre. The novel propelled him into immediate literary renown.

Camus returned to France in 1942 and a year later began working for the clandestine newspaper Combat, the journalistic arm and voice of the French Resistance movement. During this period, while contending with recurrent bouts of tuberculosis, he also published The Myth of Sisyphus, his philosophical anatomy of suicide and the absurd, and joined Gallimard Publishing as an editor, a position he held until his death.

After the Liberation, Camus continued as editor of Combat, oversaw the production and publication of two plays, The Misunderstanding and Caligula, and assumed a leading role in Parisian intellectual society in the company of Sartre and Simone de Beauvoir among others. In the late 40s his growing reputation as a writer and thinker was enlarged by the publication of The Plague, an allegorical novel and fictional parable of the Nazi Occupation and the duty of revolt, and by lecture tours to the United States and South America. In 1951 he published The Rebel, a reflection on the nature of freedom and rebellion and a philosophical critique of revolutionary violence. This powerful and controversial work, with its explicit condemnation of Marxism-Leninism and its emphatic denunciation of unrestrained violence as a means of human liberation, led to an eventual falling out with Sartre and, along with his opposition to the Algerian National Liberation Front, to his being branded a reactionary in the view of many European Communists. Yet his position also established him as an outspoken champion of individual freedom and as an impassioned critic of tyranny and terrorism, whether practiced by the Left or by the Right.

In 1956, Camus published the short, confessional novel The Fall, which unfortunately would be the last of his completed major works and which, in the opinion of some critics, is the most elegant and most under-rated of all his books. During this period he was still afflicted by tuberculosis and was perhaps even more sorely beset by the deteriorating political situation in his native Algeria—which had by now escalated from demonstrations and occasional terrorist and guerilla attacks into open violence and insurrection. Camus still hoped to champion some kind of rapprochement that would allow the native Muslim population and the French pied noir minority to live together peaceably in a new de-colonized and largely integrated, if not fully independent, nation. Alas, by this point, as he painfully realized, the chances of such an outcome were growing increasingly remote.

In the fall of 1957, following publication of Exile and the Kingdom, a collection of short fiction, Camus was shocked by news that he had been awarded the Nobel Prize for literature. He absorbed the announcement with mixed feelings of gratitude, humility, and amazement. On the one hand, the award was obviously a tremendous honor. On the other, not only did he feel that his friend and esteemed fellow novelist Andre Malraux was more deserving, he was also aware that the Nobel itself was widely regarded as the kind of accolade usually given to artists at the end of a long career. Yet, as he indicated in his acceptance speech at Stockholm, he considered his own career as still in mid-flight, with much yet to accomplish and even greater writing challenges ahead:

Every person, and assuredly every artist, wants to be recognized. So do I. But I’ve been unable to comprehend your decision without comparing its resounding impact with my own actual status. A man almost young, rich only in his doubts, and with his work still in progress…how could such a man not feel a kind of panic at hearing a decree that transports him all of a sudden…to the center of a glaring spotlight? And with what feelings could he accept this honor at a time when other writers in Europe, among them the very greatest, are condemned to silence, and even at a time when the country of his birth is going through unending misery?

Of course Camus could not have known as he spoke these words that most of his writing career was in fact behind him. Over the next two years, he published articles and continued to write, produce, and direct plays, including his own adaptation of Dostoyevsky’s The Possessed. He also formulated new concepts for film and television, assumed a leadership role in a new experimental national theater, and continued to campaign for peace and a political solution in Algeria. Unfortunately, none of these latter projects would be brought to fulfillment. On January 4, 1960, Camus died tragically in a car accident while he was a passenger in a vehicle driven by his friend and publisher Michel Gallimard, who also suffered fatal injuries. The author was buried in the local cemetery at Lourmarin, a village in Provence where he and his wife and daughters had made their home.

Upon hearing of Camus’s death, Sartre wrote a moving eulogy in the France-Observateur, saluting his former friend and political adversary not only for his distinguished contributions to French literature but especially for the heroic moral courage and “stubborn humanism” which he brought to bear against the “massive and deformed events of the day.”

2. Literary Career

According to Sartre’s perceptive appraisal, Camus was less a novelist than a writer of philosophical tales and parables in the tradition of Voltaire. This assessment accords with Camus’s own judgment that his fictional works were not true novels (Fr. romans), a form he associated with the densely populated and richly detailed social panoramas of writers like Balzac, Tolstoy, and Proust, but rather contes (“tales”) and recits (“narratives”) combining philosophical and psychological insights.

In this respect, it is also worth noting that at no time in his career did Camus ever describe himself as a deep thinker or lay claim to the title of philosopher. Instead, he nearly always referred to himself simply, yet proudly, as un ecrivain—a writer. This is an important fact to keep in mind when assessing his place in intellectual history and in twentieth-century philosophy, for by no means does he qualify as a system-builder or theorist or even as a disciplined thinker. He was instead (and here again Sartre’s assessment is astute) a sort of all-purpose critic and modern-day philosophe: a debunker of mythologies, a critic of fraud and superstition, an enemy of terror, a voice of reason and compassion, and an outspoken defender of freedom—all in all a figure very much in the Enlightenment tradition of Voltaire and Diderot. For this reason, in assessing Camus’s career and work, it may be best simply to take him at his own word and characterize him first and foremost as a writer—advisedly attaching the epithet “philosophical” for sharper accuracy and definition.

3. Camus, Philosophical Literature, and the Novel of Ideas

To pin down exactly why and in what distinctive sense Camus may be termed a philosophical writer, we can begin by comparing him with other authors who have merited the designation. Right away, we can eliminate any comparison with the efforts of Lucretius and Dante, who undertook to unfold entire cosmologies and philosophical systems in epic verse. Camus obviously attempted nothing of the sort. On the other hand, we can draw at least a limited comparison between Camus and writers like Pascal, Kierkegaard, and Nietzsche—that is, with writers who were first of all philosophers or religious writers, but whose stylistic achievements and literary flair gained them a special place in the pantheon of world literature as well. Here we may note that Camus himself was very conscious of his debt to Kierkegaard and Nietzsche (especially in the style and structure of The Myth of Sisyphus and The Rebel) and that he might very well have followed in their literary-philosophical footsteps if his tuberculosis had not side-tracked him into fiction and journalism and prevented him from pursuing an academic career.

Perhaps Camus himself best defined his own particular status as a philosophical writer when he wrote (with authors like Melville, Stendhal, Dostoyevsky, and Kafka especially in mind): “The great novelists are philosophical novelists”; that is, writers who eschew systematic explanation and create their discourse using “images instead of arguments” (The Myth of Sisyphus 74).

By his own definition, then, Camus is a philosophical writer in the sense that he has (a) conceived his own distinctive and original world-view and (b) sought to convey that view mainly through images, fictional characters and events, and via dramatic presentation rather than through critical analysis and direct discourse. He is also both a novelist of ideas and a psychological novelist, and in this respect, he certainly compares most closely to Dostoyevsky and Sartre, two other writers who combine a unique and distinctly philosophical outlook, acute psychological insight, and a dramatic style of presentation. (Like Camus, Sartre was a productive playwright, and Dostoyevsky remains perhaps the most dramatic of all novelists, as Camus clearly understood, having adapted both The Brothers Karamazov and The Possessed for the stage.)

4. Works

Camus’s reputation rests largely on the three novels published during his lifetime—The Stranger, The Plague, and The Fall—and on his two major philosophical essays—The Myth of Sisyphus and The Rebel. However, his body of work also includes a collection of short fiction, Exile and the Kingdom; an autobiographical novel, The First Man; a number of dramatic works, most notably Caligula, The Misunderstanding, The State of Siege, and The Just Assassins; several translations and adaptations, including new versions of works by Calderon, Lope de Vega, Dostoyevsky, and Faulkner; and a lengthy assortment of essays, prose pieces, critical reviews, transcribed speeches and interviews, articles, and works of journalism. A brief summary and description of the most important of Camus’s writings is presented below as preparation for a larger discussion of his philosophy and world-view, including his main ideas and recurrent philosophical themes.

a. Fiction

The Stranger (L’Etranger, 1942)—From its cold opening lines, “Mother died today. Or maybe yesterday; I can’t be sure,” to its bleak concluding image of a public execution set to take place beneath the “benign indifference of the universe,” Camus’s first and most famous novel takes the form of a terse, flat, first-person narrative by its main character Meursault, a very ordinary young man of unremarkable habits and unemotional affect who, inexplicably and in an almost absent-minded way, kills an Arab and then is arrested, tried, convicted, and sentenced to death. The neutral style of the novel—typical of what the critic Roland Barthes called “writing degree zero”—serves as a perfect vehicle for the descriptions and commentary of its anti-hero narrator, the ultimate “outsider” and a person who seems to observe everything, including his own life, with almost pathological detachment.

The Plague (La Peste, 1947)—Set in the coastal town of Oran, Camus’s second novel is the story of an outbreak of plague, traced from its subtle, insidious, unheeded beginnings, through its horrible and seemingly irresistible dominion, to its eventual climax and decline, all told from the viewpoint of one of the survivors. Camus made no effort to conceal the fact that his novel was partly based on and could be interpreted as an allegory or parable of the rise of Nazism and the nightmare of the Occupation. However, the plague metaphor is both more complicated and more flexible than that, extending to signify the Absurd in general as well as any calamity or disaster that tests the mettle of human beings, their endurance, their solidarity, their sense of responsibility, their compassion, and their will. At the end of the novel, the plague finally retreats, and the narrator reflects that a time of pestilence teaches “that there is more to admire in men than to despise,” but he also knows “that the plague bacillus never dies or disappears for good,” that “the day would come when, for the bane and the enlightening of men, it would rouse up its rats again” and send them forth yet once more to spread death and contagion into a happy and unsuspecting city.

The Fall (La Chute, 1956)—Camus’s third novel, and the last to be published during his lifetime, is in effect an extended dramatic monologue spoken by M. Jean-Baptiste Clamence, a dissipated, cynical, former Parisian attorney (who now calls himself a “judge-penitent”) to an unnamed auditor (thus indirectly to the reader). Set in a seedy bar in the red-light district of Amsterdam, the work is a small masterpiece of compression and style: a confessional (and semi-autobiographical) novel, an arresting character study and psychological portrait, and at the same time a wide-ranging philosophical discourse on guilt and innocence, expiation and punishment, good and evil.

b. Drama

Camus began his literary career as a playwright and theatre director and was planning new dramatic works for film, stage, and television at the time of his death. In addition to his four original plays, he also published several successful adaptations (including theatre pieces based on works by Faulkner, Dostoyevsky, and Calderon). He took particular pride in his work as a dramatist and man of the theatre. However, his plays never achieved the same popularity, critical success, or level of incandescence as his more famous novels and major essays.

Caligula (1938, first produced 1945)—“Men die and are not happy.” Such is the complaint against the universe pronounced by the young emperor Caligula, who in Camus’s play is less the murderous lunatic, slave to incest, narcissist, and megalomaniac of Roman history than a theatrical martyr-hero of the Absurd: a man who carries his philosophical quarrel with the meaninglessness of human existence to a kind of fanatical but logical extreme. Camus described his hero as a man “obsessed with the impossible” willing to pervert all values, and if necessary destroy himself and all those around him in the pursuit of absolute liberty. Caligula was Camus’s first attempt at portraying a figure in absolute defiance of the Absurd, and through three revisions of the play over a period of several years he eventually achieved a remarkable composite by adding to Caligula’s original portrait touches of Sade, of revolutionary nihilism, of the Nietzschean Superman, of his own version of Sisyphus, and even of Mussolini and Hitler.

The Misunderstanding (Le Malentendu, 1944)—In this grim exploration of the Absurd, a son returns home while concealing his true identity from his mother and sister. The two women operate a boarding house where, in order to make ends meet, they quietly murder and rob their patrons. Through a tangle of misunderstanding and mistaken identity they wind up murdering their unrecognized visitor. Camus has explained the drama as an attempt to capture the atmosphere of malaise, corruption, demoralization, and anonymity that he experienced while living in France during the German occupation. Despite the play’s dark themes and bleak style, he described its philosophy as ultimately optimistic: “It amounts to saying that in an unjust or indifferent world man can save himself, and save others, by practicing the most basic sincerity and pronouncing the most appropriate word.”

State of Siege (L’Etat de Siege, 1948)—This odd allegorical drama combines features of the medieval morality play with elements of Calderon and the Spanish baroque; it also has apocalyptic themes, bits of music hall comedy, and a collection of avant-garde theatrics thrown in for good measure. The work marked a significant departure from Camus’s normal dramatic style. It also resulted in virtually universal disapproval and negative reviews from Paris theatre-goers and critics, many of whom came expecting a play based on Camus’s recent novel The Plague. The play is set in the Spanish seaport city of Cadiz, famous for its beaches, carnivals, and street musicians. By the end of the first act, the normally laid-back and carefree citizens fall under the dominion of a gaudily beribboned and uniformed dictator named Plague (based on Generalissimo Franco) and his officious, clipboard-wielding Secretary (who turns out to be a modern, bureaucratic incarnation of the medieval figure Death). One of the prominent concerns of the play is the Orwellian theme of the degradation of language via totalitarian politics and bureaucracy (symbolized onstage by calls for silence, scenes in pantomime, and a gagged chorus). As one character observes, “we are steadily nearing that perfect moment when nothing anybody says will rouse the least echo in another’s mind.”

The Just Assassins (Les Justes, 1950)—First performed in Paris to largely favorable reviews, this play is based on real-life characters and an actual historical event: the 1905 assassination of the Russian Grand Duke Sergei Alexandrovich by Ivan Kalyayev and fellow members of the Combat Organization of the Socialist Revolutionary Party. The play effectively dramatizes the issues that Camus would later explore in detail in The Rebel, especially the question of whether acts of terrorism and political violence can ever be morally justified (and if so, with what limitations and in what specific circumstances). The historical Kalyayev passed up his original opportunity to bomb the Grand Duke’s carriage because the Duke was accompanied by his wife and two young nephews. However, this was no act of conscience on Kalyayev’s part but a purely practical decision based on his calculation that the murder of children would prove a setback to the revolution. After the successful completion of his bombing mission and subsequent arrest, Kalyayev welcomed his execution on similarly practical and purely political grounds, believing that his death would further the cause of revolution and social justice. Camus’s Kalyayev, on the other hand, is a far more agonized and conscientious figure, neither so cold-blooded nor so calculating as his real-life counterpart. Upon seeing the two children in the carriage, he refuses to toss his bomb not because doing so would be politically inexpedient but because he is overcome emotionally, temporarily unnerved by the sad expression in their eyes.  Similarly, at the end of the play he embraces his death not so much because it will aid the revolution, but almost as a form of karmic penance, as if it were indeed some kind of sacred duty or metaphysical requirement that must be performed in order for true justice to be achieved.

c. Essays, Letters, Prose Collections, Articles, and Reviews

Betwixt and Between (L’Envers et l’endroit, 1937)—This short collection of semi-autobiographical, semi-fictional, philosophical pieces might be dismissed as juvenilia and largely ignored if it were not for the fact that it represents Camus’s first attempt to formulate a coherent life-outlook and world-view. The collection, which in a way serves as a germ or starting point for the author’s later philosophy, consists of five lyrical essays. In “Irony” (“L’Ironie”), a reflection on youth and age, Camus asserts, in the manner of a young disciple of Pascal, our essential solitariness in life and death. In “Between yes and no” (“Entre Oui et Non”) he suggests that to hope is as empty and as pointless as to despair, yet he goes beyond nihilism by positing a fundamental value to existence-in-the-world. In “Death in the soul” (“La Mort dans l’ame”) he supplies a sort of existential travel review, contrasting his impressions of central and Eastern Europe (which he views as purgatorial and morgue-like) with the more spontaneous life of Italy and Mediterranean culture. The piece thus affirms the author’s lifelong preference for the color and vitality of the Mediterranean world, and especially North Africa, as opposed to what he perceives as the soulless cold-heartedness of modern Europe. In “Love of life” (“Amour de vivre”) he claims there can be no love of life without despair of life and thus largely re-asserts the essentially tragic, ancient Greek view that the very beauty of human existence is largely contingent upon its brevity and fragility. The concluding essay, “Betwixt and between” (“L’Envers et l’endroit”), summarizes and re-emphasizes the Romantic themes of the collection as a whole: our fundamental “aloneness,” the importance of imagination and openness to experience, the imperative to “live as if….”

Nuptials (Noces, 1938)—This collection of four rhapsodic narratives supplements and amplifies the youthful philosophy expressed in Betwixt and Between. That joy is necessarily intertwined with despair, that the shortness of life confers a premium on intense experience, and that the world is both beautiful and violent—these are, once again, Camus’s principal themes. “Summer in Algiers,” which is probably the best (and best-known) of the essays in the collection, is a lyrical, at times almost ecstatic, celebration of sea, sun, and the North African landscape. Affirming a defiantly atheistic creed, Camus concludes with one of the core ideas of his philosophy: “If there is a sin against life, it consists not so much in despairing as in hoping for another life and in eluding the implacable grandeur of this one.”

The Myth of Sisyphus (Le Mythe de Sisyphe, 1943)—If there is a single non-fiction work that can be considered an essential or fundamental statement of Camus’s philosophy, it is this extended essay on the ethics of suicide (eventually translated and repackaged for American publication in 1955). It is here that Camus formally introduces and fully articulates his most famous idea, the concept of the Absurd, and his equally famous image of life as a Sisyphean struggle. From its provocative opening sentence—“There is but one truly serious philosophical problem, and that is suicide”—to its stirring, paradoxical conclusion—“The struggle itself toward the heights is enough to fill a man’s heart. One must imagine Sisyphus happy”—the book has something interesting and challenging on nearly every page and is shot through with brilliant aphorisms and insights. In the end, Camus rejects suicide: the Absurd must not be evaded either by religion (“philosophical suicide”) or by annihilation (“physical suicide”); the task of living should not merely be accepted, it must be embraced.

The Rebel (L’Homme Revolte, 1951)—Camus considered this work a continuation of the critical and philosophical investigation of the Absurd that he began with The Myth of Sisyphus. Only this time his primary concern is not suicide but murder. He takes up the question of whether acts of terrorism and political violence can be morally justified, which is basically the same question he had addressed earlier in his play The Just Assassins. After arguing that an authentic life inevitably involves some form of conscientious moral revolt, Camus winds up concluding that only in rare and very narrowly defined instances is political violence justified. Camus’s critique of revolutionary violence and terror in this work, and particularly his caustic assessment of Marxism-Leninism (which he accused of sacrificing innocent lives on the altar of History), touched nerves throughout Europe and led in part to his celebrated feud with Sartre and other French leftists.

Resistance, Rebellion, and Death (1960)—This posthumous collection is of interest to students of Camus mainly because it brings together an unusual assortment of his non-fiction writings on a wide range of topics, from art and politics to the advantages of pessimism and the virtues (from a non-believer’s standpoint) of Christianity. Of special interest are two pieces that helped secure Camus’s worldwide reputation as a voice of liberty: “Letters to a German Friend,” a set of four letters originally written during the Nazi Occupation, and “Reflections on the Guillotine,” a denunciation of the death penalty cited for special mention by the Nobel committee and eventually revised and re-published as a companion essay to go with fellow death-penalty opponent Arthur Koestler’s “Reflections on Hanging.”

5. Philosophy

To re-emphasize a point made earlier, Camus considered himself first and foremost a writer (un ecrivain). Indeed, Camus’s dissertation advisor penciled onto his dissertation the assessment “More a writer than a philosopher.” And at various times in his career he also accepted the labels journalist, humanist, novelist, and even moralist. However, he apparently never felt comfortable identifying himself as a philosopher—a term he seems to have associated with rigorous academic training, systematic thinking, logical consistency, and a coherent, carefully defined doctrine or body of ideas.

This is not to suggest that Camus lacked ideas or to say that his thought cannot be considered a personal philosophy. It is simply to point out that he was not a systematic, or even a notably disciplined thinker and that, unlike Heidegger and Sartre, for example, he showed very little interest in metaphysics and ontology, which seems to be one of the reasons he consistently denied that he was an existentialist. In short, he was not much given to speculative philosophy or any kind of abstract theorizing. His thought is instead nearly always related to current events (e.g., the Spanish War, revolt in Algeria) and is consistently grounded in down-to-earth moral and political reality.

a. Background and Influences

Though he was baptized, raised, and educated as a Catholic and invariably respectful towards the Church, Camus seems to have been a natural-born pagan who showed almost no instinct whatsoever for belief in the supernatural. Even as a youth, he was more of a sun-worshipper and nature lover than a boy notable for his piety or religious faith. On the other hand, there is no denying that Christian literature and philosophy served as an important influence on his early thought and intellectual development. As a young high school student, Camus studied the Bible, read and savored the Spanish mystics St. Theresa of Avila and St. John of the Cross, and was introduced to the thought of St. Augustine. St. Augustine would later serve as the subject of his dissertation and become—as a fellow North African writer, quasi-existentialist, and conscientious observer-critic of his own life—an important lifelong influence.

In college Camus absorbed Kierkegaard, who, after Augustine, was probably the single greatest Christian influence on his thought. He also studied Schopenhauer and Nietzsche—undoubtedly the two writers who did the most to set him on his own path of defiant pessimism and atheism. Other notable influences include not only the major modern philosophers from the academic curriculum—from Descartes and Spinoza to Bergson—but also, and just as importantly, philosophical writers like Stendhal, Melville, Dostoyevsky, and Kafka.

b. Development

The two earliest expressions of Camus’s personal philosophy are his works Betwixt and Between (1937) and Nuptials (1938). Here he unfolds what is essentially a hedonistic, indeed almost primitivistic, celebration of nature and the life of the senses. In the Romantic poetic tradition of writers like Rilke and Wallace Stevens, he offers a forceful rejection of all hereafters and an emphatic embrace of the here and now. There is no salvation, he argues, no transcendence; there is only the enjoyment of consciousness and natural being. One life, this life, is enough. Sky and sea, mountain and desert, have their own beauty and magnificence and constitute a sufficient heaven.

The critic John Cruikshank termed this stage in Camus’s thinking “naïve atheism” and attributed it to his ecstatic and somewhat immature “Mediterraneanism.” Naïve seems an apt characterization for a philosophy that is romantically bold and uncomplicated yet somewhat lacking in sophistication and logical clarity. On the other hand, if we keep in mind Camus’s theatrical background and preference for dramatic presentation, there may actually be more depth and complexity to his thought here than meets the eye. That is to say, just as it would be simplistic and reductive to equate Camus’s philosophy of revolt with that of his character Caligula (who is at best a kind of extreme or mad spokesperson for the author), so in the same way it is possible that the pensées and opinions presented in Nuptials and Betwixt and Between are not so much the views of Camus as they are poetically heightened observations of an artfully crafted narrator—an exuberant alter ego who is far more spontaneous and free-spirited than his more naturally reserved and sober-minded author.

In any case, however we assess the ideas expressed in Betwixt and Between and Nuptials, it is clear that these early writings represent an important, if comparatively raw and simple, stage in Camus’s development as a thinker, one in which his views differ markedly from his more mature philosophy in several noteworthy respects. In the first place, the Camus of Nuptials is still a young man of twenty-five, aflame with youthful joie de vivre. He favors a life of impulse and daring as it was honored and practiced in both Romantic literature and in the streets of Belcourt. Recently married and divorced, raised in poverty and in close quarters, beset with health problems, this young man develops an understandable passion for clear air, open space, colorful dreams, panoramic vistas, and the breath-taking prospects and challenges of the larger world. Consequently, the Camus of the period 1937-38 is a decidedly different writer from the Camus who will ascend the dais at Stockholm nearly twenty years later.

The young Camus is more of a sensualist and pleasure-seeker, more of a dandy and aesthete, than the more hardened and austere figure who will endure the Occupation while serving in the French underground. He is a writer passionate in his conviction that life ought to be lived vividly and intensely—indeed rebelliously (to use the term that will take on increasing importance in his thought). He is also a writer attracted to causes, though he is not yet the author who will become world-famous for his moral seriousness and passionate commitment to justice and freedom. All of which is understandable. After all, the Camus of the middle 1930s had not yet witnessed and absorbed the shattering spectacle and disillusioning effects of the Spanish Civil War, the rise of Fascism, Hitlerism, and Stalinism, the coming into being of total war and weapons of mass destruction, and the terrible reign of genocide and terror that would characterize the period 1938-1945. It was under the pressure and in direct response to the events of this period that Camus’s mature philosophy—with its core set of humanistic themes and ideas—emerged and gradually took shape. That mature philosophy is no longer a “naïve atheism” but a very reflective and critical brand of unbelief. It is proudly and inconsolably pessimistic, but not in a polemical or overbearing way. It is unbending, hardheaded, determinedly skeptical. It is tolerant and respectful of world religious creeds, but at the same time wholly unsympathetic to them. In the end it is an affirmative philosophy that accepts and approves, and in its own way blesses, our dreadful mortality and our fundamental isolation in the world.

c. Themes and Ideas

Regardless of whether he is producing drama, fiction, or non-fiction, Camus in his mature writings nearly always takes up and re-explores the same basic philosophical issues. These recurrent topoi constitute the key components of his thought. They include themes like the Absurd, alienation, suicide, and rebellion that almost automatically come to mind whenever his name is mentioned. Hence any summary of his place in modern philosophy would be incomplete without at least a brief discussion of these ideas and how they fit together to form a distinctive and original world-view.

i. The Absurd

Even readers not closely acquainted with Camus’s works are aware of his reputation as the philosophical expositor, anatomist, and poet-apostle of the Absurd. Indeed, as even sitcom writers and stand-up comics apparently understand (odd fact: the comic-bleak final episode of Seinfeld has been compared to The Stranger, and Camus’s thought has been used to explain episodes of The Simpsons), it is largely through the thought and writings of the French-Algerian author that the concept of absurdity has become a part not only of world literature and twentieth-century philosophy but also of modern popular culture.

What then is meant by the notion of the Absurd? Contrary to the view conveyed by popular culture, the Absurd (at least in Camus’s terms) does not simply refer to some vague perception that modern life is fraught with paradoxes, incongruities, and intellectual confusion. (Although that perception is certainly consistent with his formula.) Instead, as he himself emphasizes and tries to make clear, the Absurd expresses a fundamental disharmony, a tragic incompatibility, in our existence. In effect, he argues that the Absurd is the product of a collision or confrontation between our human desire for order, meaning, and purpose in life and the blank, indifferent “silence of the universe.” (“The absurd is not in man nor in the world,” Camus explains, “but in their presence together . . . it is the only bond uniting them.”)

So here we are: poor creatures desperately seeking hope and meaning in a hopeless, meaningless world. Sartre, in his essay-review of The Stranger, provides an additional gloss on the idea: “The absurd, to be sure, resides neither in man nor in the world, if you consider each separately. But since man’s dominant characteristic is ‘being in the world,’ the absurd is, in the end, an inseparable part of the human condition.” The Absurd, then, presents itself in the form of an existential opposition. It arises from the human demand for clarity and transcendence on the one hand and a cosmos that offers nothing of the kind on the other. Such is our fate: we inhabit a world that is indifferent to our sufferings and deaf to our protests.

In Camus’s view there are three possible philosophical responses to this predicament. Two of these he condemns as evasions, and the other he puts forward as a proper solution.

The first choice is blunt and simple: physical suicide. If we decide that a life without some essential purpose or meaning is not worth living, we can simply choose to kill ourselves. Camus rejects this choice as cowardly. In his terms it is a repudiation or renunciation of life, not a true revolt.

The second choice is the religious solution of positing a transcendent world of solace and meaning beyond the Absurd. Camus calls this solution “philosophical suicide” and rejects it as transparently evasive and fraudulent. To adopt a supernatural solution to the problem of the Absurd (for example, through some type of mysticism or leap of faith) is to annihilate reason, which in Camus’s view is as fatal and self-destructive as physical suicide. In effect, instead of removing himself from the absurd confrontation of self and world like the physical suicide, the religious believer simply removes the offending world and replaces it, via a kind of metaphysical abracadabra, with a more agreeable alternative.

The third choice—in Camus’s view the only authentic and valid solution—is simply to accept absurdity, or better yet to embrace it, and to continue living. Since the Absurd in his view is an unavoidable, indeed defining, characteristic of the human condition, the only proper response to it is full, unflinching, courageous acceptance. Life, he says, can “be lived all the better if it has no meaning.”

The example par excellence of this option of spiritual courage and metaphysical revolt is the mythical Sisyphus of Camus’s philosophical essay. Doomed to eternal labor at his rock, fully conscious of the essential hopelessness of his plight, Sisyphus nevertheless pushes on. In doing so he becomes for Camus a superb icon of the spirit of revolt and of the human condition. To rise each day to fight a battle you know you cannot win, and to do this with wit, grace, compassion for others, and even a sense of mission, is to face the Absurd in a spirit of true heroism.

Over the course of his career, Camus examines the Absurd from multiple perspectives and through the eyes of many different characters—from the mad Caligula, who is obsessed with the problem, to the strangely aloof and yet simultaneously self-absorbed Meursault, who seems indifferent to it even as he exemplifies and is finally victimized by it. In The Myth of Sisyphus, Camus traces it in specific characters of legend and literature (Don Juan, Ivan Karamazov) and also in certain character types (the Actor, the Conqueror), all of whom may be understood as in some way a version or manifestation of Sisyphus, the archetypal absurd hero.

[Note: A rather different, yet possibly related, notion of the Absurd is proposed and analyzed in the work of Kierkegaard, especially in Fear and Trembling and Repetition. For Kierkegaard, however, the Absurd describes not an essential and universal human condition, but the special condition and nature of religious faith—a paradoxical state in which matters of will and perception that are objectively impossible can nevertheless be ultimately true. Though it is hard to say whether Camus had Kierkegaard particularly in mind when he developed his own concept of the absurd, there can be little doubt that Kierkegaard’s knight of faith is in certain ways an important predecessor of Camus’s Sisyphus: both figures are involved in impossible and endlessly agonizing tasks, which they nevertheless confidently and even cheerfully pursue. In the knight’s quixotic defiance and solipsism, Camus found a model for his own ideal of heroic affirmation and philosophical revolt.]

ii. Revolt

The companion theme to the Absurd in Camus’s oeuvre (and the only other philosophical topic to which he devoted an entire book) is the idea of Revolt. What is revolt? Simply defined, it is the Sisyphean spirit of defiance in the face of the Absurd. More technically and less metaphorically, it is a spirit of opposition against any perceived unfairness, oppression, or indignity in the human condition.

Rebellion in Camus’s sense begins with a recognition of boundaries, of limits that define one’s essential selfhood and core sense of being and thus must not be infringed—as when a slave stands up to his master and says in effect “thus far, and no further, shall I be commanded.” This defining of the self as at some point inviolable appears to be an act of pure egoism and individualism, but it is not. In fact Camus argues at considerable length to show that an act of conscientious revolt is ultimately far more than just an individual gesture or an act of solitary protest. The rebel, he writes, holds that there is a “common good more important than his own destiny” and that there are “rights more important than himself.” He acts “in the name of certain values which are still indeterminate but which he feels are common to himself and to all men” (The Rebel 15-16).

Camus then goes on to assert that an “analysis of rebellion leads at least to the suspicion that, contrary to the postulates of contemporary thought, a human nature does exist, as the Greeks believed.” After all, “Why rebel,” he asks, “if there is nothing permanent in the self worth preserving?” The slave who stands up and asserts himself actually does so for “the sake of everyone in the world.” He declares in effect that “all men—even the man who insults and oppresses him—have a natural community.” Here we may note that the idea that there may indeed be an essential human nature is actually more than a “suspicion” as far as Camus himself was concerned. Indeed for him it was more like a fundamental article of his humanist faith. In any case it represents one of the core principles of his ethics and is one of the tenets that sets his philosophy apart from existentialism.

True revolt, then, is performed not just for the self but also in solidarity with and out of compassion for others. And for this reason, Camus is led to conclude that revolt too has its limits. If it begins with and necessarily involves a recognition of human community and a common human dignity, it cannot, without betraying its own true character, treat others as if they were lacking in that dignity or not a part of that community. In the end it is remarkable, and indeed surprising, how closely Camus’s philosophy of revolt, despite the author’s fervent atheism and individualism, echoes Kantian ethics with its prohibition against treating human beings as means and its ideal of the human community as a kingdom of ends.

iii. The Outsider

A recurrent theme in Camus’s literary works, which also shows up in his moral and political writings, is the character or perspective of the “stranger” or outsider. Meursault, the laconic narrator of The Stranger, is the most obvious example. He seems to observe everything, even his own behavior, from an outside perspective. Like an anthropologist, he records his observations with clinical detachment at the same time that he is warily observed by the community around him.

Camus came by this perspective naturally. As a European in Africa, an African in Europe, an infidel among Muslims, a lapsed Catholic, a Communist Party drop-out, an underground resister (who at times had to use code names and false identities), a “child of the state” raised by a widowed mother (who was illiterate, partially deaf, and had a speech impediment), Camus lived most of his life in various groups and communities without really being integrated within them. This outside view, the perspective of the exile, became his characteristic stance as a writer. It explains both the cool, objective (“zero-degree”) precision of much of his work and also the high value he assigned to longed-for ideals of friendship, community, solidarity, and brotherhood.

iv. Guilt and Innocence

Throughout his writing career, Camus showed a deep interest in questions of guilt and innocence. Once again Meursault in The Stranger provides a striking example. Is he legally innocent of the murder he is charged with? Or is he technically guilty? On the one hand, there seems to have been no conscious intention behind his action. Indeed the killing takes place almost as if by accident, with Meursault in a kind of absent-minded daze, distracted by the sun. From this point of view, his crime seems surreal and his trial and subsequent conviction a travesty. On the other hand, it is hard for the reader not to share the view of other characters in the novel, especially Meursault’s accusers, witnesses, and jury, in whose eyes he seems to be a seriously defective human being—at best, a kind of hollow man and at worst, a monster of self-centeredness and insularity. That the character has evoked such a wide range of responses from critics and readers—from sympathy to horror—is a tribute to the psychological complexity and subtlety of Camus’s portrait.

Camus’s brilliantly crafted final novel, The Fall, continues his keen interest in the theme of guilt, this time via a narrator who is virtually obsessed with it. The significantly named Jean-Baptiste Clamence (a voice in the wilderness calling for clemency and forgiveness) is tortured by guilt in the wake of a seemingly casual incident. While strolling home one drizzly November evening, he shows little concern and almost no emotional reaction at all to the suicidal plunge of a young woman into the Seine. But afterwards the incident begins to gnaw at him, and eventually he comes to view his inaction as typical of a long pattern of personal vanity and as a colossal failure of human sympathy on his part. Wracked by remorse and self-loathing, he gradually descends into a figurative hell. Formerly an attorney, he is now a self-described “judge-penitent” (a combination sinner, tempter, prosecutor, and father-confessor) who shows up each night at his local haunt, a sailor’s bar near Amsterdam’s red light district, where, somewhat in the manner of Coleridge’s Ancient Mariner, he recounts his story to whoever will hear it. In the final sections of the novel, amid distinctly Christian imagery and symbolism, he declares his crucial insight that, despite our pretensions to righteousness, we are all guilty. Hence no human being has the right to pass final moral judgment on another.

In a final twist, Clamence asserts that his acid self-portrait is also a mirror for his contemporaries. Hence his confession is also an accusation—not only of his nameless companion (who serves as the mute auditor for his monologue) but ultimately of the hypocrite lecteur as well.

v. Christianity vs. “Paganism”

The theme of guilt and innocence in Camus’s writings relates closely to another recurrent tension in his thought: the opposition of Christian and pagan ideas and influences. At heart a nature-worshipper, and by instinct a skeptic and non-believer, Camus nevertheless retained a lifelong interest and respect for Christian philosophy and literature. In particular, he seems to have recognized St. Augustine and Kierkegaard as intellectual kinsmen and writers with whom he shared a common passion for controversy, literary flourish, self-scrutiny, and self-dramatization. Christian images, symbols, and allusions abound in all his work (probably more so than in the writing of any other avowed atheist in modern literature), and Christian themes—judgment, forgiveness, despair, sacrifice, passion, and so forth—permeate the novels. (Meursault and Clamence, it is worth noting, are presented not just as sinners, devils, and outcasts, but in several instances explicitly, and not entirely ironically, as Christ figures.)

Meanwhile alongside and against this leitmotif of Christian images and themes, Camus sets the main components of his essentially pagan worldview. Like Nietzsche, he maintains a special admiration for Greek heroic values and pessimism and for classical virtues like courage and honor. What might be termed Romantic values also merit particular esteem within his philosophy: passion, absorption in pure being, an appreciation for and indeed a willingness to revel in raw sensory experience, the glory of the moment, the beauty of the world.

As a result of this duality of influence, Camus’s basic philosophical problem becomes how to reconcile his Augustinian sense of original sin (universal guilt) and rampant moral evil with his personal ideal of pagan primitivism (universal innocence) and with his conviction that the natural world and our life in it have intrinsic beauty and value. Can an absurd world have intrinsic value? Is authentic pessimism compatible with the view that there is an essential dignity to human life? Such questions raise the possibility that there may be deep logical inconsistencies within Camus’s philosophy, and some critics (notably Sartre) have suggested that these inconsistencies cannot be surmounted except through some sort of Kierkegaardian leap of faith on Camus’s part—in this case a leap leading to a belief not in God but in man.

Such a leap is certainly implied in an oft-quoted remark from Camus’s “Letter to a German Friend,” where he wrote: “I continue to believe that this world has no supernatural meaning…But I know that something in the world has meaning—man.” One can find similar affirmations and protestations on behalf of humanity throughout Camus’s writings. They are almost a hallmark of his philosophical style. Oracular and high-flown, they clearly have more rhetorical force than logical potency. On the other hand, if we are trying to locate Camus’s place in European philosophical tradition, they provide a strong clue as to where he properly belongs. Surprisingly, the sentiment here, a commonplace of the Enlightenment and of traditional liberalism, is much closer in spirit to the exuberant secular humanism of the Italian Renaissance than to the agnostic skepticism of contemporary post-modernism.

vi. Individual vs. History and Mass Culture

A primary theme of early twentieth-century European literature and critical thought is the rise of modern mass civilization and its suffocating effects of alienation and dehumanization. This became a pervasive theme by the time Camus was establishing his literary reputation. Anxiety over the fate of Western culture, already intense, escalated to apocalyptic levels with the sudden emergence of fascism, totalitarianism, and new technologies of coercion and death. Here then was a subject ready-made for a writer of Camus’s political and humanistic views. He responded to the occasion with typical force and eloquence.

In one way or another, the themes of alienation and dehumanization as by-products of an increasingly technical and automated world enter into nearly all of Camus’s works. Even his concept of the Absurd is compounded by a social and economic world in which meaningless routines and mind-numbing repetitions predominate. The drudgery of Sisyphus is mirrored and amplified in the assembly line, the business office, the government bureau, and especially in the penal colony and concentration camp.

In line with this theme, the ever-ambiguous Meursault in The Stranger can be understood as both a depressing manifestation of the newly emerging mass personality (that is, as a figure devoid of basic human feelings and passions) and, conversely, as a lone hold-out, a last remaining specimen of the old Romanticism—and hence a figure who is viewed as both dangerous and alien by the robotic majority. Similarly, The Plague can be interpreted, on at least one level, as an allegory in which humanity must be preserved from the fatal pestilence of mass culture, which converts formerly free, autonomous, independent-minded human beings into a soulless new species.

At various times in the novel, Camus’s narrator describes the plague as if it were a dull but highly capable public official or bureaucrat:

It was, above all, a shrewd, unflagging adversary; a skilled organizer, doing his work thoroughly and well. (180)

But it seemed the plague had settled in for good at its most virulent, and it took its daily toll of deaths with the punctual zeal of a good civil servant. (235)

 This identification of the plague with oppressive civil bureaucracy and the routinization of charisma looks forward to the author’s play The State of Siege, where plague is used once again as a symbol for totalitarianism—only this time it is personified in an almost cartoonish way as a kind of overbearing government functionary or office manager from hell. Clad in a gaudy military uniform bedecked with ribbons and decorations, the character Plague (a satirical portrait of Generalissimo Francisco Franco—or El Caudillo as he liked to style himself) is closely attended by his personal Secretary and loyal assistant Death, depicted as a prim, officious female bureaucrat who also favors military garb and who carries an ever-present clipboard and notebook.

So Plague is a fascist dictator, and Death a solicitous commissar. Together these figures represent a system of pervasive control and micro-management that threatens the future of mass society.

In his reflections on this theme of post-industrial dehumanization, Camus differs from most other European writers (and especially from those on the Left) in viewing mass reform and revolutionary movements, including Marxism, as representing at least as great a threat to individual freedom as late-stage capitalism. Throughout his career he continued to cherish and defend old-fashioned virtues like personal courage and honor that other Left-wing intellectuals tended to view as reactionary or bourgeois.

vii. Suicide

Suicide is the central subject of The Myth of Sisyphus and serves as a background theme in Caligula and The Fall. In Caligula the mad title character, in a fit of horror and revulsion at the meaninglessness of life, would rather die—and bring the world down with him—than accept a cosmos that is indifferent to human fate or that will not submit to his individual will. In The Fall, a stranger’s act of suicide serves as the starting point for a bitter ritual of self-scrutiny and remorse on the part of the narrator.

Like Wittgenstein (who had a family history of suicide and suffered from bouts of depression), Camus considered suicide the fundamental issue for moral philosophy. However, unlike other philosophers who have written on the subject (from Cicero and Seneca to Montaigne and Schopenhauer), Camus seems uninterested in assessing the traditional motives and justifications for suicide (for instance, to avoid a long, painful, and debilitating illness or as a response to personal tragedy or scandal). Indeed, he seems interested in the problem only to the extent that it represents one possible response to the Absurd. His verdict on the matter is unqualified and clear: The only courageous and morally valid response to the Absurd is to continue living—“Suicide is not an option.”

viii. The Death Penalty

From the time he first heard the story of his father’s literal nausea and revulsion after witnessing a public execution, Camus began a vocal and lifelong opposition to the death penalty. Executions by guillotine were a common public spectacle in Algeria during his lifetime, but he refused to attend them and recoiled bitterly at their very mention.

Condemnation of capital punishment is both explicit and implicit in his writings. For example, in The Stranger Meursault’s long confinement during his trial and his eventual execution are presented as part of an elaborate, ceremonial ritual involving both public and religious authorities. The grim rationality of this process of legalized murder contrasts markedly with the sudden, irrational, almost accidental nature of his actual crime. Similarly, in The Myth of Sisyphus, the would-be suicide is contrasted with his fatal opposite, the man condemned to death, and we are continually reminded that a sentence of death is our common fate in an absurd universe.

Camus’s opposition to the death penalty is not specifically philosophical. That is, it is not based on a particular moral theory or principle (such as Cesare Beccaria’s utilitarian objection that capital punishment is wrong because it has not been proven to have a deterrent effect greater than life imprisonment). Camus’s opposition, in contrast, is humanitarian, conscientious, almost visceral. Like Victor Hugo, his great predecessor on this issue, he views the death penalty as an egregious barbarism—an act of blood riot and vengeance covered over with a thin veneer of law and civility to make it acceptable to modern sensibilities. That it is also an act of vengeance aimed primarily at the poor and oppressed, and that it is given religious sanction, makes it even more hideous and indefensible in his view.

Camus’s essay “Reflections on the Guillotine” supplies a detailed examination of the issue. An eloquent personal statement with compelling psychological and philosophical insights, it includes the author’s direct rebuttal to traditional retributionist arguments in favor of capital punishment (such as Kant’s claim that death is the legally appropriate, indeed morally required, penalty for murder). To all who argue that murder must be punished in kind, Camus replies:

Capital punishment is the most premeditated of murders, to which no criminal’s deed, however calculated, can be compared. For there to be an equivalency, the death penalty would have to punish a criminal who had warned his victim of the date on which he would inflict a horrible death on him and who, from that moment onward, had confined him at his mercy for months. Such a monster is not to be encountered in private life.

Camus concludes his essay by arguing that, at the very least, France should abolish the savage spectacle of the guillotine and replace it with a more humane procedure (such as lethal injection). But he still retains a scant hope that capital punishment will be completely abolished at some point in the future: "In the unified Europe of the future the solemn abolition of the death penalty ought to be the first article of the European Code we all hope for." Camus himself did not live to see the day, but he would no doubt be gratified to know that abolition of capital punishment is now an essential prerequisite for membership in the European Union.

6. Existentialism

Camus is often classified as an existentialist writer, and it is easy to see why. Affinities with Kierkegaard and Sartre are patent. He shares with these philosophers (and with the other major writers in the existentialist tradition, from Augustine and Pascal to Dostoyevsky and Nietzsche) an habitual and intense interest in the active human psyche, in the life of conscience or spirit as it is actually experienced and lived. Like these writers, he aims at nothing less than a thorough, candid exegesis of the human condition, and like them he exhibits not just a philosophical attraction but also a personal commitment to such values as individualism, free choice, inner strength, authenticity, personal responsibility, and self-determination.

However, one troublesome fact remains: throughout his career Camus repeatedly denied that he was an existentialist. Was this an accurate and honest self-assessment? On the one hand, some critics have questioned this “denial” (using the term almost in its modern clinical sense), attributing it to the celebrated Sartre-Camus political “feud” or to a certain stubbornness or even contrariness on Camus’s part. In their view, Camus qualifies as, at minimum, a closet existentialist, and in certain respects (e.g., in his unconditional and passionate concern for the individual) as an even truer specimen of the type than Sartre.

On the other hand, besides his personal rejection of the label, there appear to be solid reasons for challenging the claim that Camus is an existentialist. For one thing, it is noteworthy that he never showed much interest in (indeed he largely avoided) metaphysical and ontological questions (the philosophical raison d'être of Heidegger and Sartre). Of course there is no rule that says an existentialist must be a metaphysician. However, Camus's seeming aversion to technical philosophical discussion does suggest one way in which he distanced himself from contemporary existentialist thought.

Another point of divergence is that Camus seems to have regarded existentialism as a complete and systematic world-view, that is, a fully articulated doctrine. In his view, to be a true existentialist one had to commit to the entire doctrine (and not merely to bits and pieces of it), and this was apparently something he was unwilling to do.

A further point of separation, and possibly a decisive one, is that Camus actively challenged and set himself apart from the existentialist motto that existence precedes essence. Ultimately, against Sartre in particular and existentialists in general, he clings to his instinctive belief in a common human nature. In his view human existence necessarily includes an essential core element of dignity and value, and in this respect he seems surprisingly closer to the humanist tradition from Aristotle to Kant than to the modern tradition of skepticism and relativism from Nietzsche to Derrida (the latter his fellow-countryman and, at least in his commitment to human rights and opposition to the death penalty, his spiritual successor and descendant).

7. Camus, Colonialism, and Algeria

One of the main topics and even preoccupations of recent Camus studies has been the writer’s attitude, as reflected in both his fiction and in his non-fiction, towards European colonialism in general and his response to the French-Algerian “problem” or “question” (as it was often termed) in particular. The first thing that can be noted in this respect is that, unlike Sartre and many other European intellectuals, Camus never delivered a formal critique of colonialism. Nor did he sign any of the frequent manifestos and declarations deploring the practice – a sin for which he was sharply criticized and even accused of moral cowardice. In 1958, partly to explain and vindicate himself, but mainly to illustrate and give voice to the painful complexities of colonial reform and decolonization, he published Algerian Chronicles, a collection of his writings on the vexing “problem” that he had personally agonized over for more than twenty years.

In addition to his perceived silence on the issue of colonialism (a silence, as Algerian Chronicles reveals, motivated by his fear that speaking out aggressively would be more likely to heighten tensions than to secure the united and independent post-colonial Algeria he hoped for), Camus has also been criticized for the virtual erasure of Arab characters and culture from his fiction. Several writers, most prominently and forcefully Edward Said, have denounced the nearly total absence of Arab characters in Camus's novels and stories. Moreover, these critics point out, the few Arab characters who do appear are inevitably mute and anonymous: either shadow figures, like the nameless murder victim at the climactic center of The Stranger, or mere bodies, like the uncounted and unidentified native Algerians who presumably make up the major part of the death toll in The Plague but who otherwise have no speaking role or even visible presence in the novel. Along this same line of criticism, The Meursault Investigation, by the Algerian writer Kamel Daoud, is a fictional and metafictional riposte to Camus: a reimagining of the characters and events of The Stranger told from the point of view of the brother of the murdered Arab, it is both a corrective rebuke and a literary tribute to its famous original. The Irish writer and politician Conor Cruise O'Brien made a partial attempt to rescue Camus from this line of criticism by arguing that The Fall should be read as an autobiographical work in which Camus confesses his own personal failures, including his guilt at having become a privileged citizen in a poor country. In the introduction to her recent expanded edition of Algerian Chronicles, Alice Kaplan addresses these and related criticisms and cites relevant passages from Camus's own writing in response to them.

8. Significance and Legacy

Obviously, Camus’s writings remain the primary reason for his continuing importance and the chief source of his cultural legacy, but his fame is also due to his exemplary life. He truly lived his philosophy; thus it is in his personal political stands and public statements as well as in his books that his views are clearly articulated. In short, he bequeathed not just his words but also his actions. Taken together, those words and actions embody a core set of liberal democratic values—including tolerance, justice, liberty, open-mindedness, respect for personhood, condemnation of violence, and resistance to tyranny—that can be fully approved and acted upon by the modern intellectual engagé.

On a purely literary level, one of Camus's most original contributions to modern discourse is his distinctive prose style. Terse and hard-boiled, yet at the same time lyrical, and indeed capable of great, soaring flights of emotion and feeling, Camus's style represents a deliberate attempt on his part to wed the famous clarity, elegance, and dry precision of the French philosophical tradition with the more sonorous and opulent manner of nineteenth-century Romantic fiction. The result is something like a cross between Hemingway (a Camus favorite) and Melville (another favorite), or between Diderot and Hugo. For the most part, when we read Camus we encounter the plain syntax, simple vocabulary, and biting aphorism typical of modern theatre or noir detective fiction. (It is worth noting that Camus admired the novels of Dashiell Hammett and James M. Cain, and that his own work has influenced the style and the existentialist loner heroes of a succession of later crime writers, including John D. MacDonald and Lee Child.) This muted, laconic base style frequently becomes a counterpoint or springboard for extended musings and lavish descriptions almost in the manner of Proust. This attempted reconciliation or union of opposing styles is not just an aesthetic gesture on the author's part: it is also a moral and political statement. It says, in effect, that the life of reason and the life of feeling need not be opposed; that intellect and passion can, and should, operate together.

Perhaps the greatest inspiration and example that Camus provides for contemporary readers is the lesson that it is still possible for a serious thinker to face the modern world (with a full understanding of its contradictions, injustices, brutal flaws, and absurdities) with hardly a grain of hope, yet utterly without cynicism. To read Camus is to find words like justice, freedom, humanity, and dignity used plainly and openly, without apology or embarrassment, and without the pained or derisive facial expressions or invisible quotation marks that almost automatically accompany those terms in public discourse today.

At Stockholm Camus concluded his Nobel acceptance speech with a stirring reminder and challenge to modern writers: “The nobility of our craft,” he declared, “will always be rooted in two commitments, both difficult to maintain: the refusal to lie about what one knows and the resistance to oppression.” He left behind a body of work faithful to his own credo that the arts of language must always be used in the service of truth and the service of liberty.

9. References and Further Reading

a. Works by Albert Camus

  • The Stranger. Trans. Stuart Gilbert. New York: Vintage-Random House, 1946.
  • Camus’s first novel, a classic portrait of the “outsider” originally published in France as L’Etranger by Librairie Gallimard in 1942.
  • The Plague. Trans. Stuart Gilbert. New York: Vintage-International, 1991.
  • Camus’s second novel, originally published in France as La Peste by Librairie Gallimard in 1947.
  • The Fall. Trans. Justin O’Brien. New York: Vintage-Random House, 1956.
  • Camus’s third novel, a confessional monologue originally published in France as La Chute by Librairie Gallimard in 1956.
  • The Myth of Sisyphus and other Essays. Trans. Justin O’Brien. New York: Vintage-Random House, 1955.
  • A philosophical meditation on suicide originally published as Le Mythe de Sisyphe by Librairie Gallimard in 1942.
  • The Rebel. Trans. Anthony Bower. New York: Vintage-Random House, 1956.
  • A philosophical essay on the ethics of rebellion and political violence originally published as L'Homme révolté by Librairie Gallimard in 1951.
  • Exile and the Kingdom. Trans. Justin O’Brien.  New York: Vintage-Random House, 1958.
  • A collection of short fiction originally published as L’Exil et le Royaume by Librairie Gallimard in 1957.
  • Lyrical and Critical Essays. Ed. Philip Thody. Trans. Ellen Conroy Kennedy. New York: Vintage-Random House, 1970.
  • A selection of critical writings, including essays on Melville, Faulkner, and Sartre, plus all the early essays from Betwixt and Between and Nuptials.
  • Resistance, Rebellion, and Death. Trans. Justin O’Brien. New York: Vintage International, 1995.
  • A collection of essays on a wide variety of political topics ranging from the death penalty to the Cold War.
  • Caligula and Three Other Plays. Trans. Stuart Gilbert. New York: Vintage-Random House, 1958.
  • A collection of four of Camus’s best-known dramatic works: Caligula, The Misunderstanding, The State of Siege, and The Just Assassins, with a foreword by the author.
  • The First Man. Trans. David Hapgood. New York: Alfred Knopf, 1995.
  • A posthumous novel, partly autobiographical.
  • Camus at Combat: Writings 1944-1947. Ed. Jacqueline Lévi-Valensi. Trans. Arthur Goldhammer. Princeton, NJ: Princeton University Press, 2006.
  • A collection of articles and editorials that Camus wrote during and after WW II for the French Resistance journal Combat.
  • Algerian Chronicles. Ed. Alice Kaplan. Trans. Arthur Goldhammer. Cambridge, MA: Belknap Press, 2013.
  • A collection of Camus’s political writings on Algeria.

b. Critical and Biographical Studies

  • Barthes, Roland. Writing Degree Zero. New York: Hill and Wang, 1968.
  • Bloom, Harold, ed. Albert Camus. New York: Chelsea House, 1989.
  • Brée, Germaine. Camus. New Brunswick, NJ: Rutgers University Press, 1961.
  • Brée, Germaine, ed. Camus: A Collection of Critical Essays. Englewood Cliffs, NJ: Prentice-Hall, 1962.
  • Cruickshank, John. Albert Camus and the Literature of Revolt. London: Oxford University Press, 1959.
  • Cruickshank, John. The Novelist as Philosopher. London: Oxford University Press, 1959.
  • Foley, John. Albert Camus: From the Absurd to Revolt. Montreal: McGill-Queens University Press, 2008.
  • Hughes, Edward J. ed. The Cambridge Companion to Camus. Cambridge, UK: Cambridge University Press, 2007.
  • Kauffman, Walter, ed. Religion from Tolstoy to Camus. New York: Harper, 1964.
  • Lottman, Herbert R. Albert Camus: A Biography. Corte Madera, CA: Gingko Press, 1997.
  • Malraux, Andre. Anti-Memoirs. New York: Holt, Rinehart, and Winston, 1968.
  • Margerrison, Christine, et al. Albert Camus in the 21st Century: A Reassessment of his Thinking at the Dawn of the New Millennium. Amsterdam, NL: Rodopi, 2008.
  • McBride, Joseph. Albert Camus: Philosopher and Littérateur. New York: St. Martin’s Press, 1992.
  • O’Brien, Conor Cruise. Camus. London: Faber and Faber, 1970.
  • Said, Edward. “Camus and the French Imperial Experience.” In Culture and Imperialism. New York: Vintage Books, 1994.
  • Sartre, Jean-Paul. “Camus’s The Outsider.” In Situations. New York: George Braziller, 1965.
  • Srigley, Ronald D. Albert Camus' Critique of Modernity. Columbia, MO: University of Missouri Press, 2011.
  • Thody, Philip. Albert Camus, 1913-1960. London: Hamish Hamilton, 1961.
  • Todd, Olivier. Albert Camus: A Life. New York: Alfred A. Knopf, 1997.
  • Zaretsky, Robert. A Life Worth Living: Albert Camus and the Quest for Meaning. Cambridge, MA: Belknap Press of Harvard University Press, 2013.

 

Author Information

David Simpson
Email: dsimpson@depaul.edu
DePaul University
U. S. A.

Proper Functionalism

‘Proper Functionalism’ refers to a family of epistemological views according to which whether a belief (or some other doxastic state) was formed by way of properly functioning cognitive faculties plays a crucial role in whether it has a certain kind of positive epistemic status (such as being an item of knowledge, or a justified belief). Alvin Plantinga’s proper functionalist theory of knowledge has been the most prominent among these theories. Michael Bergmann’s (2006) proper functionalist theory of justification has also been the focus of much discussion. But proper functionalist theories of other epistemic properties have also been developed. Richard Otte (1987) and Alvin Plantinga (1993b: Chapter 9) offer proper functionalist theories of epistemic probability, for example. Nicholas Wolterstorff (2010) defends a proper functionalist theory of epistemic oughts. And Peter Graham (2010) develops a proper functionalist theory of epistemic entitlement. Since Plantinga’s theory of knowledge and Bergmann’s theory of justification are the most widely known and most discussed proper functionalist views, and because they share many features with other proper functionalist theories, this article focuses primarily on them—what can be said in their favor, the challenges they face, the ways in which they might be defended, and how they compare with some of their closest rivals.

Table of Contents

  1. Plantinga’s Proper Functionalist Theory of Knowledge
    1. Motivations of Plantinga’s Theory
    2. The Content of Plantinga’s Theory
    3. Swampman
    4. Gettier Cases
  2. Bergmann’s Proper Functionalist Theory of Justification
    1. Some Advantages of Bergmann’s Theory
    2. Some Objections to Bergmann’s Theory
  3. Rival Theories
    1. Proper Functionalism and Phenomenal Conservatism
    2. Proper Functionalism and Virtue Epistemology
  4. References and Further Reading

1. Plantinga’s Proper Functionalist Theory of Knowledge

This article begins with a discussion of Alvin Plantinga's proper functionalist theory of knowledge. As Plantinga himself frames matters, he takes himself to be giving a proper functionalist theory of a property he calls "warrant," where warrant is that property, whatever precisely it is, which makes the difference between knowledge and mere true belief.

a. Motivations of Plantinga’s Theory

A theory of warrant is subject to Gettier-style counterexamples if a belief can meet all the conditions the theory specifies as jointly sufficient for knowledge, but meet them merely by accident (in a manner that precludes the belief's being an item of knowledge). Plantinga argues that any theory that fails to construe a proper function condition as necessary for warrant is subject to counterexamples of this sort. This is so whether the theory treats the believer's internal states, external factors, or both as most relevant to whether her belief has warrant.

By way of illustration, Plantinga (1993b: 31-37) adopts a scenario originally introduced by Roderick Chisholm, who attributes it to Alexius Meinong. The scenario envisions an aging forest ranger living in the mountains, with a set of wind chimes hanging from a bough. The ranger is unaware that his hearing has been degenerating of late, to the point where he can no longer hear the chimes. He is also unaware that he is occasionally subject to small auditory hallucinations in which he appears to hear the wind chimes. On one occasion, he is thus appeared to and comes to believe that the wind is blowing. As it happens, the wind is blowing and causing the chimes to ring. Even if we stipulate that all is going well with this belief from the ranger's own internal perspective, it is clear nonetheless that his belief lacks warrant. And the reason it lacks warrant, Plantinga maintains, is that it is the product of cognitive malfunction.

One might question whether this explanation is correct, however, on the ground that certain cognitively external environmental conditions are also amiss in this case. In particular, the case is one in which there is no reliable connection between the ranger's appearing to hear the wind chimes and the wind's blowing. And one might think that it is primarily for this reason that the ranger's belief lacks warrant. This thought might push one toward bypassing proper functionalism and endorsing a reliabilist theory of warrant instead (that is, an account according to which a belief's having warrant is primarily a matter of its being formed or sustained in a way that involves a reliable connection to the truth). But Plantinga also argues that any reliabilist theory which does not incorporate a proper function condition is subject to Gettier-style counterexamples.

Plantinga (1993a: 195-198, 205-207) takes this to be illustrated by The Case of the Epistemically Serendipitous Brain Lesion. Imagine that Sam has a brain lesion, one that engenders cognitive processes which mostly result in false beliefs. One process the lesion engenders, however, results in the belief that one has a brain lesion. This particular process is highly reliable (it always results in a true belief). But clearly the resulting belief is not an item of knowledge. What explains why this is so, Plantinga maintains, is that the belief in question (though formed by a truth-reliable process) is not the result of cognitive proper function. Accordingly, Plantinga concludes that any reliabilist account of warrant must be augmented with a proper function condition.

Kenneth Boyce and Alvin Plantinga (2012: 127-128) have emphasized that there may be an even stronger lesson to be drawn from these cases. Once such cases are on the table, one can imagine variations of them in which different combinations of internal and external conditions (other than proper function ones) are met, but in which the belief in question lacks warrant because it ends up being true merely by accident. Furthermore, Boyce and Plantinga contend that part of what explains why the beliefs in these variations are true merely by accident (in a way that precludes their being items of knowledge) is that they were not formed in a manner specified by cognitive proper function; that is, the way they get at the truth is accidental from the perspective of the cognitive design plan. If that is correct, however, then, as Boyce and Plantinga point out, there is reason to believe that the notion of cognitive proper function is centrally involved in the notion of non-accidentality that any adequate analysis of warrant must capture.

b. The Content of Plantinga’s Theory

Examples of the sort discussed above are used by Plantinga to motivate the claim that cognitive proper function is necessary for warrant. Plantinga (1993b: 21-24) also maintains that the relevant notion of proper function presupposes that of a design plan—something that specifies the manner in which a thing is supposed to function in various circumstances. As Plantinga conceives of it, a design plan may be modeled as a set of ordered triples, where each triple specifies a circumstance, a response, and a purpose or function. One need not initially take this notion of a design plan to involve conscious design or purpose. The notion of a design plan at issue here is whatever notion is presupposed by talk of proper function for biological systems (as when a physician determines that a human heart is functioning the way it is supposed to on account of its pumping at 70 beats per minute). Plantinga himself gives a theistic account of this notion, but other proper functionalists, such as Ruth Millikan (1984) and Peter Graham (2012), have offered naturalistic, evolutionary accounts.

While Plantinga (1993b: 46) takes cognitive proper function to be necessary for warrant, he does not take it to be sufficient (or even nearly sufficient). Other conditions must also be satisfied. To a rough, first approximation, Plantinga takes a belief to be warranted if and only if it satisfies the following four conditions:

(1) The belief in question is formed by way of cognitive faculties that are properly functioning.

(2) The cognitive faculties in question are aimed at the production of true beliefs.

(3) The design plan is a good one. That is, when a belief is formed by way of truth-aimed cognitive proper function in the sort of environment for which the cognitive faculties in question were designed, there is a high objective probability that the resulting belief is true.

(4) The belief is formed in the sort of environment for which the cognitive faculties in question were designed.

While Plantinga adds various nuances, these four conditions serve to capture the main outlines of his view.
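To make the shape of the analysis vivid, the following is a minimal, purely illustrative sketch in Python (not Plantinga's own formalism; the sample design-plan entries and the names design_plan and warranted are invented for the example). It models a design plan as a set of circumstance-response-purpose triples and treats warrant, to the same rough first approximation, as the conjunction of conditions (1) through (4):

```python
# Illustrative toy model only; the entries and names below are invented,
# not drawn from Plantinga's text.

# A design plan modeled as a set of (circumstance, response, purpose) triples.
design_plan = {
    ("seeming to hear the wind chimes", "believe the wind is blowing", "truth"),
    ("touching a hard, rounded surface", "believe one is holding a sphere", "truth"),
    ("facing grave danger", "believe one will pull through", "survival"),  # not aimed at truth
}

def warranted(proper_function: bool,
              truth_aimed: bool,
              good_design_plan: bool,
              intended_environment: bool) -> bool:
    """Rough first approximation: a belief is warranted just in case
    conditions (1)-(4) above are all satisfied."""
    return (proper_function and truth_aimed
            and good_design_plan and intended_environment)

# The aging ranger's hallucination-based belief fails condition (1),
# so it comes out unwarranted even if the other conditions hold.
print(warranted(proper_function=False, truth_aimed=True,
                good_design_plan=True, intended_environment=True))  # False
```

On this toy rendering, Sam's lesion-produced belief likewise fails condition (1) despite being reliably formed, which mirrors the diagnoses given above.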

Many objections have been raised to Plantinga’s theory. Two of the most prominent among them are considered below. The first amounts to an objection to the claim that Plantinga’s four conditions are necessary for warrant. The second amounts to an objection to the claim that they are sufficient. For a sampling of other objections, one would do well to examine the collection of essays on Plantinga’s theory of warrant edited by Jonathan L. Kvanvig (1996).

c. Swampman

Some have argued that there are counterexamples to Plantinga’s theory involving beings who have warranted beliefs but who nevertheless fail to exhibit cognitive proper function. The most well-known version of this objection comes from Ernest Sosa (1993), who adapts a scenario originally proposed by Donald Davidson, and uses it against proper functionalism. In that scenario, Davidson is standing next to a swamp when lightning strikes a nearby dead tree, thereby obliterating Davidson. Simultaneously, by sheer accident, the lightning also causes the molecules of the tree to arrange themselves into a perfect duplicate of Davidson as he was at the time of his demise. The Davidson duplicate—this “Swampman”—leaves the swamp, acting and talking as if it were Davidson, having all the same intrinsic properties that Davidson would have had, had he left the swamp without having his unfortunate encounter. According to Sosa, “it … seems logically possible for … Swampman to have warranted beliefs not long after creation if not right away” (p. 54). Yet, not being the product of intentional design, and not having any evolutionary history, it would seem that Swampman has no design plan. And so we have what appears to be a counterexample to proper functionalism.

There are various responses to the Swampman objection. Plantinga (1993c: 206-208) and Graham (2012: 466-467) have each argued, albeit for different reasons, that it is doubtful the Swampman scenario is metaphysically possible. They have also suggested, again for different reasons, that if the scenario is possible, perhaps Swampman can acquire conditions for proper functioning without natural selection or intentional design; see Plantinga (1993c: 78) and Graham (2014). Bergmann (2006: 147-149) has argued that we are intuitively inclined to assign positive epistemic status to Swampman's beliefs only to the extent that we are inclined to think that his beliefs are fitting responses to the inputs he receives, and that we are inclined to think Swampman's beliefs are fitting only to the extent that we are inclined to think of those responses as exhibiting cognitive proper function. Boyce and Plantinga (2012: 130-131) have suggested that since it is merely by accident that Swampman forms his beliefs reliably, we can think of this case as a Gettier scenario (or at least as relevantly analogous to one), and thereby deny that Swampman's beliefs have warrant. For a similar response, see McNabb (2015).

Kenneth Boyce and Andrew Moon (2015) have since argued that the Swampman objection relies on a false intuition concerning the conditions under which the belief of one creature has warrant if the belief of another, similar creature does. According to them, the central intuition motivating our reaction to the Swampman case may be stated as follows:

(CI) If a belief B is warranted for a subject S and another subject S* comes to hold B in the same way that S came to hold B in a relevantly similar environment to the one in which S came to hold B, then B is warranted for S*.

They argue that it is CI, in conjunction with the stipulation that Swampman forms his beliefs in the same way that an ordinary human being would (an ordinary human being to whom we would be inclined to attribute knowledge), that explains our tendency to regard Swampman as having warranted beliefs. Boyce and Moon then go on to argue that CI is subject to counterexamples, and that this undercuts the force of the Swampman objection. See Section 3b for further discussion of their argument.

d. Gettier Cases

Plantinga has conceded that his theory, as he originally formulated it, is subject to Gettier-style counterexamples. In 2000, he offered the following counterexample of his own:

I own a Chevrolet van, drive to Notre Dame on a football Saturday, and unthinkingly park in one of the many places reserved for the football coach. Naturally, his minions tow my van away and, as befits such lèse majesté, destroy it. By a splendid piece of good luck, however, I have won the Varsity Club’s Win-a-Chevrolet-Van contest, although I haven’t yet heard the good news. You ask me what sort of automobile I own; I reply, both honestly and truthfully, “A Chevrolet van.” My belief that I own such a van is true, but ‘just by accident’ (more accurately, it is only by accident that I happen to form a true belief); hence it does not constitute knowledge. All of the non-environmental conditions for warrant, furthermore, are met. It also looks as if the environmental condition is met: after all, isn’t the cognitive environment here on earth and in South Bend just the one for which our faculties were designed?

Clearly Plantinga’s belief (though true) is not an item of knowledge in this case and thus lacks warrant. So Plantinga’s original four conditions are not jointly sufficient for warrant. Something else must be added. But what?

According to Plantinga, what the original account requires is an addition to the environmental condition. More specifically, the problem in the above case is that while the global environment that Plantinga is in is the one for which his faculties were designed, his more local environment is epistemically misleading. So in order to deal with this counterexample, Plantinga proposes adding a resolution condition. This condition involves a distinction between two different kinds of environment, what Plantinga refers to as the “maxi-environment” and what he refers to as the “mini-environment.” The maxi-environment, Plantinga stipulates, is the kind of global environment in which we live here on earth, the kind of environment for which our cognitive faculties were designed (or to which they were adapted). The mini-environment, by contrast, is a much more specific state of affairs, one that includes, for a given exercise of one’s cognitive faculties E resulting in a belief B, all of the epistemically relevant circumstances obtaining when B is formed (though diminished with respect to whether B is true).

Letting ‘MBE’ denote the cognitive mini-environment with respect to B and E (which Plantinga says may contain as large a fragment of the actual world as one likes, up to whether B is true), Plantinga maintains that the needed resolution condition may be stated as follows:

(RC) A belief B produced by an exercise of cognitive powers has warrant sufficient for knowledge only if MBE (the mini-environment with respect to B and E) is favorable for E.

This, of course, raises the question of just what it is for a mini-environment to be "favorable." Plantinga has, in the past, offered various proposals for what favorableness consists in, proposals he has subsequently admitted to be unsatisfactory. A proposal is found in Boyce and Plantinga (2012: 134). For other proposals, see Crisp (2000) and Chignell (2003).

2. Bergmann’s Proper Functionalist Theory of Justification

Plantinga's theory of warrant is not the only kind of proper functionalist theory. Proper functionalist theories of other epistemic concepts have also been developed. Noteworthy among these is Michael Bergmann's proper functionalist theory of epistemic justification. The kind of epistemic justification Bergmann (2006: 4-5) is interested in is doxastic justification. The having of this property is frequently (though not universally) held to be a necessary condition for a belief's being an item of knowledge. In fact, it is often held that a belief's having this property, in conjunction with its being non-accidentally true (in a way that rules out Gettier cases), is not only necessary but also sufficient for its being an item of knowledge.

A major divide in the literature occurs between those philosophers who are “externalists” about this kind of justification and those who are “internalists” about it. Just how this divide should be characterized is itself a matter of dispute. But for present purposes, we may characterize internalists about justification as being committed (at least) to the view that whether a belief is justified depends entirely on which mental states that belief is based upon (in such a way that necessarily, any two believers who are exactly alike in terms of their mental states and in terms of which of those mental states their beliefs are based upon are also alike in terms of which of their beliefs are justified). Externalists, by contrast, maintain that whether a belief is justified may depend on other factors.

It should be noted, however, that Bergmann (2006: chapter 3) divides up the territory a bit differently, though not in a way that impacts the current discussion. He takes it to be a necessary condition for a view of justification to count as “internalist” that it include an awareness requirement (that is, that it require, in order for a belief to be justified, that the believing subject is actually or potentially aware of some justification-contributor to that belief). The characterization of internalism given here, by contrast, includes no such requirement (and is similar to the characterization of a view of justification that Bergmann calls “mentalism,” one which he takes to be distinct from both externalism and internalism).

As Bergmann (2006: 3-7) points out, it is not always clear that philosophers who appear to dispute the nature of justification are actually disagreeing with one another. That is because it is plausible that epistemologists sometimes use the term 'justification' in different ways. He notes, for example, that some epistemologists use the term to pick out a subjective notion, one that is satisfied by a belief provided that the subject is blameless in holding it. Others, by contrast, he observes, use the term to pick out a more objective notion, one according to which a belief is justified only if it is fitting with respect to the believer's evidence or other epistemically relevant inputs. It is this objective notion of justification in which Bergmann is interested (see also pp. 111-113). He takes it to be a conceptually open question whether this kind of justification is necessary for knowledge (though he thinks it is). And he also takes some disputes between self-avowed externalists (like himself) and self-avowed internalists (such as Richard Feldman and Earl Conee) to involve a genuine disagreement concerning the nature of this kind of justification.

Bergmann argues that the right way to analyze this kind of justification is in terms of proper function. More specifically, Bergmann's (2006: 132-137) theory takes the first of Plantinga's conditions (cognitive proper function) to be necessary for a belief's being justified. It also takes the first three of Plantinga's conditions (leaving out the fourth, environmental condition), in conjunction with the condition that the subject does not take the relevant belief to be defeated, to be jointly sufficient for a belief's being justified. The motivations for this view are perhaps best appreciated by looking to its purported advantages.

a. Some Advantages of Bergmann’s Theory

Epistemic justification of the kind Bergmann has in mind has some puzzling features. On the one hand, it involves some notion of truth-aptness. In particular, there would appear to be some important, non-trivial, connection between a belief being justified and it being objectively likely to be true. At the very least, it would be a significant cost for a theory of justification to deny this. But which ways of forming and sustaining beliefs result in a high proportion of true beliefs depends on what sort of environment one is in. Our tending to believe that occluded objects still exist, for example, results in a high proportion of true beliefs in our environment, but it is easy to imagine environments in which this would not be the case. These considerations push in the direction of regarding what makes for epistemic justification a contingent matter, one that depends on the sort of environment one inhabits.

On the other hand, justification is a normative concept, the satisfaction of which does not appear to depend on the sort of environment in which one is located. This aspect of justification is made especially vivid by "The New Evil Demon Problem," originally put forward by Keith Lehrer and Stewart Cohen (1983) as a problem for reliabilist theories of justification. Consider a population of beings, just like ourselves, who form their beliefs in response to experience in just the ways that we do, but who (unlike us) are victims of a Cartesian demon who renders their belief-forming processes unreliable. From many reliabilist theories of justification, it follows that these beings have far fewer justified beliefs than we do (since most of their beliefs are not formed in a truth-reliable manner). But this seems false. These beings are in an epistemically bad situation, to be sure, but they are still forming their beliefs in ways that are appropriate given their experiences; intuitively, their beliefs are at least justified.

Bergmann's theory of epistemic justification nicely accommodates both of these puzzling features. First, it accommodates the intuition that inhabitants of a demon world who are like us, and who form their beliefs in response to experience in the same ways we do, have the same proportion of justified beliefs that we do. For, as Bergmann (2006: 141-143) notes, his theory entails that, provided these beings have a cognitive design plan comparable to ours and are properly functioning, many of their beliefs are justified, even though their ways of forming beliefs are, for the most part, unreliable. The theory also, as Bergmann points out, accommodates the intuition that justification is importantly and non-trivially connected with truth-aptness. For, insofar as the beings living in a demon world fulfill Bergmann's conditions for justification, the manner in which they form their beliefs would be truth-apt if they were placed in the environment for which their cognitive faculties were designed. Finally, since different design plans may be tailored to different kinds of environments, Bergmann's theory accommodates the possibility that what makes for justification is a contingent matter, one that depends on the kind of environment for which the cognitive faculties of the creatures at issue were designed.

b. Some Objections to Bergmann’s Theory

Like Plantinga’s theory, Bergmann’s faces the objection that it is subject to counterexamples involving creatures like Swampman. There is no need, though, to rehearse the various responses that might be given to this objection here (since many of them will be the same or similar to those described in Section 1c). As a theory of justification, however, Bergmann’s view also faces other objections, ones which are not (or not as obviously) applicable to a theory of warrant.

Todd R. Long (2012: 264-265) questions, for example, whether Bergmann's theory does in fact do a better job than alternative views in handling the New Evil Demon Problem. He grants that Bergmann's view does accommodate the intuition that demon-world victims with the same design plan as ours have justified beliefs (in the same proportions that we do). But he notes that Bergmann's view also entails that the same cannot be said for demon-world victims who are mentally indistinguishable from ourselves but whose ways of forming beliefs run contrary to their design plan. And Long maintains that to deny that the beliefs of demon-world victims in the latter situation are justified also runs contrary to our intuitions. Bergmann (2006: 150), however, anticipates an objection like this. He suggests that there is an analogy between Swampman and the demon victims in such a scenario; accordingly, he adapts his reply to the former so as to apply it to the latter.

Another kind of objection to a proper functionalist theory of justification involves cases in which the design plan specifies ways of belief formation that appear to be objectively bad in some way, in spite of the fact that this component of the design plan is successfully aimed at truth. Long (2012) and Tucker (2014b) each present variations of this objection directed specifically against Bergmann’s view. There are also precedents found among objections to Plantinga’s theory of warrant (see for example Feldman 1993: 44). There are at least two kinds of cases of this sort. The following discussion will make reference to cases described by Tucker, who provides examples of each kind.

In the first kind of case, the design plan specifies coming to hold a belief on the basis of what appears to be an objectively bad form of reasoning. Tucker (2014b: 3321-3322) presents a case, for instance, in which a design plan specifies coming to hold a certain belief on the basis of the fallacy of denying the antecedent. As Tucker points out, even though denying the antecedent is, from a logical point of view, an objectively bad form of reasoning, there are circumstances in which reasoning that way is reliable. So there is no reason in principle why a good, truth-aimed design plan could not specify forming a belief in that way, under the right conditions. Even so, it is counterintuitive to think that a belief formed by way of committing a logical fallacy could be justified (at least in the absence of any further basis).
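As a purely formal aside (not part of Tucker's discussion), a brute-force truth-table check in Python makes the invalidity of denying the antecedent explicit: there is an assignment on which both premises are true and the conclusion is false.

```python
# Check the argument form: P -> Q, not P, therefore not Q (denying the antecedent).
# The form is invalid if some row makes both premises true and the conclusion false.
from itertools import product

def implies(p: bool, q: bool) -> bool:
    # Material conditional: false only when p is true and q is false.
    return (not p) or q

invalidating_rows = [
    (p, q)
    for p, q in product([True, False], repeat=2)
    if implies(p, q) and (not p) and not (not q)
]
print(invalidating_rows)  # [(False, True)]: premises true, conclusion false
```

Whether reasoning this way is reliable in a given environment is then a matter of how often, in that environment, the invalidating combination actually arises.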

In the second kind of case, the design plan specifies coming to hold a belief on the basis of an input that intuitively fails to provide any kind of epistemic support for that belief. Tucker (pp. 3318-3319) presents an example, for instance, in which a person comes to believe Gödel’s incompleteness theorem solely on the basis of his belief that his students hate a particular type of beer. Since Gödel’s incompleteness theorem is a necessary truth, there is no question that this belief-forming process is reliable. So there is no reason in principle why a good, truth-aimed design plan could not specify that a belief be formed in this way. Even so, it seems wrong to say that someone could come to be justified in believing Gödel’s incompleteness theorem solely on the basis of that belief.

This is a formidable objection. But there may be things that can be said on the proper functionalist's behalf. Consider once again the first kind of case, the case that involves coming to hold a belief on the basis of formally bad reasoning. Some things that have been said in defense of reliabilism might also be of use to the proper functionalist here. Alvin Goldman (2002: 146-153), for example, points to research by cognitive psychologists (such as Amos Tversky and Daniel Kahneman) indicating that human beings tend to rely on heuristics when engaged in probabilistic reasoning. As is now well known, these heuristics make people prone to commit elementary probabilistic fallacies. The conclusion that some psychologists have drawn is that these findings show human beings to be terrible at probabilistic reasoning. But as Goldman notes, other psychologists have drawn a more optimistic conclusion.

Goldman points to the work of a group of evolutionary psychologists (led by Gerd Gigerenzer, Leda Cosmides, and John Tooby) who argue that, given the limited information and computational power with which organisms must contend, an inference mechanism can be advantageous if it (in Goldman’s words) “often draws accurate conclusions about real-world environments, and does so quickly and with little computational effort” (p. 152). The heuristics humans rely on in probabilistic reasoning, some of these psychologists maintain, are mechanisms of just that sort. If that is the case, then perhaps human beings often do come to hold justified beliefs by way of these mechanisms after all, in spite of the fact that they are formally suspect. And if that is so, then perhaps other kinds of beings might come to form justified beliefs on the basis of kinds of reasoning that (from a purely logical point of view) are formally suspect, but nonetheless reliable in the environments for which their cognitive faculties were designed.

Now consider the second kind of case, the case in which the design plan specifies coming to hold a belief on the basis of an input that intuitively fails to provide any kind of epistemic support for that belief. Why is it exactly (concerning Tucker's example) that we are inclined to deny that a person's belief that his students dislike a particular type of beer could justify the belief that Gödel's incompleteness theorem is true? Perhaps it is because there does not appear to be any interesting logical connection between the content of the latter belief and the belief on which it is based. But a similar observation concerning the relationship between our sense experiences and the content of our perceptual beliefs is part of what motivates Bergmann's proper functionalist theory.

As Bergmann (2006: 119) points out, “Thomas Reid emphasized that there does not seem to be any logical connection between our sense experiences and the content of the beliefs based on them.” Bergmann notes, for example, that “the tactile sensations we experience when touching a hard surface seem to have no logical relation to (nor do they resemble) the content of the hardness beliefs they prompt.” Because this is so, Bergmann argues that the evidential support relations that hold between various sensory experiences and the beliefs formed in response to them cannot be explained in terms of necessary connections. But this prompts the question as to what does explain these support relations. Bergmann (2006: 130-131) argues that proper functionalism provides a good answer to this question. The connections are to be explained by way of which belief-forming responses to sensory inputs are specified by the cognitive design plan.

To accept this motivation for proper functionalism is to accept the claim that at least some epistemic support relations hold only contingently. It is also to countenance the possibility that the epistemic support relations that hold for certain cognizers might seem utterly bizarre from the perspective of creatures like us. So, perhaps, for those who do take this motivation on board, the possibility of an agent’s coming to justifiably believe that Gödel’s incompleteness theorem is true solely on the basis of a belief concerning the beer preferences of his students no longer seems so counterintuitive. (See Bergmann (2006: 141) for a similar response to BonJour’s purported counterexamples to externalist views of justification involving reliably formed clairvoyant beliefs).

3. Rival Theories

Proper functionalist theories do not exist in a vacuum. A full appreciation of their merits or demerits requires an investigation into how well they stack up against their rivals. Two kinds of theories in particular are often put up against proper functionalism: phenomenal conservatism and virtue epistemology. Proponents of these theories sometimes claim that they satisfy many of the same motivations as proper functionalism while having fewer costs, as well as other advantages.

a. Proper Functionalism and Phenomenal Conservatism

At least to a first approximation, a phenomenal conservative theory of doxastic justification may be characterized as the view that a belief with the content that p is justified for an agent if it seems to the agent that p, the agent appropriately bases her belief that p on that seeming, and the agent has no defeaters for that belief. (See Phenomenal Conservatism for more details). As noted in Section 2a, proper functionalists about justification point to the apparent contingency of the connection between various experiences and the beliefs they justify as a motivation for their view. Phenomenal conservatives sometimes claim that their view does just as well at accommodating this apparent contingency while preserving the claim that there is a necessary connection between the things that justify our beliefs and the beliefs they justify. For this reason, phenomenal conservatism might be thought to do a better job than proper functionalism in accommodating the New Evil Demon intuition. Some phenomenal conservatives have also contended that it does a better job in accounting for the nature of evidential support.

Tucker (2011: 58-63) presses this point in connection with his objection (discussed in Section 2b) that proper functionalism allows for inputs which intuitively fail to provide any kind of epistemic support for a belief to justify that belief. In the example previously discussed, Tucker pointed to an instance in which a belief served as such an input. But Tucker also supplies examples in which the same seems to be true of the support relations that hold between various sensory experiences and the beliefs they are purported to justify. He notes, for example, that it is counterintuitive to think that a sensory experience associated with seeing a beautiful sunset could justify the belief that Gödel’s incompleteness theorem is true. But a design plan (presumably different from ours) might well specify that this is an appropriate belief-forming process.

Here the proper functionalist might attempt once more to press the Reidian point that in general it appears true that there is no inherent connection between our sensory experiences and the contents of the beliefs based on them. But Tucker (2011: 56-58, 61-63) suggests a way the advocate of phenomenal conservatism could account for the role that sensory experience plays in justifying our beliefs that accommodates this fact. According to Tucker, sensory experience might play a role in the justification of a certain belief by triggering a seeming with the content of that belief, it being a contingent matter which sensations trigger which seemings. Andrew Cullison (2013: 34-37) makes a similar suggestion, noting that just as two different sentences from different languages might well express the same proposition, two different kinds of cognitive apparatus associated with different species might cause seemings of the same content in response to differing kinds of phenomenology. This accommodates the Reidian point while preserving the claim that there is a necessary connection between the things that justify our beliefs (that is, our seeming states) and the beliefs they justify (via the identities of their contents).

Suppose one agrees that a phenomenal conservative view of justification does better than a proper functionalist view on these counts. This of course does not commit one to agreeing that phenomenal conservatism does better than proper functionalism overall. Bergmann (2013) argues, for example, that proper functionalists can accommodate many of the intuitions that motivate phenomenal conservatism, while also doing a better job of accommodating the intuition that some belief formations, downstream from sensory experiences, are objectively fitting responses to those experiences, whereas others are not.

Bergmann notes, for instance, that proper functionalists might adopt a model according to which, for humans (though not necessarily for all cognizers), when all goes well, a belief formed in response to a sensory experience is justified via being based on an intermediate seeming (one that is appropriately caused by the experience). He argues that this model accommodates many of the intuitions to which phenomenal conservatives appeal. But it also, he points out, allows for the possibility that there is an objective mismatch between a belief formed in response to a sensory experience and the nature of that experience, one which prevents the belief in question from being justified, even when the content of that belief matches the content of the intermediate seeming.

Bergmann describes, for example, a case in which a human cognizer, suffering from brain damage, forms the belief that she is holding a hard spherical object, in response to the olfactory sensation she experiences while smelling a lilac bush. Even if it is stipulated that she bases this belief on an intermediate seeming with the same content as her belief, it can still seem that her belief is objectively unfitting (in relation to her experience) and, for that reason, unjustified. A proper functionalist can accommodate this intuition, Bergmann claims, whereas a phenomenal conservative cannot. The proper functionalist can maintain that the reason the cognizer’s belief is objectively unfitting in this case is that, even though it is based on an appropriate intermediate seeming, it is not the appropriate response to the relevant sensation; it is not the belief her design plan specifies should result.

Relatedly, one might think that proper functionalism does better than phenomenal conservatism in accounting for the relation between justification and truth-aptness. A common objection to phenomenal conservative views is that they suffer from a “cognitive penetration” problem. In certain kinds of wishful thinking cases, for example, a seeming state might be caused by a desire; and in some such cases the believer in question will be unaware of this fact, and have no defeater for the belief in question. According to phenomenal conservatives, a belief properly based on such a seeming will still be justified. But to many this seems wrong. One explanation for why this consequence seems wrong is that it threatens to radically undermine the connection between justification and truth. A proper functionalist, by contrast, might maintain that when such cognitively penetrated seemings are produced in human beings, this is due either to cognitive malfunction or to one of the non-truth aimed facets of our cognitive design plan (either of which, according to her view, would render the belief unjustified). See Tucker (2014a) however for an argument that proper functionalists also suffer from cognitive penetration problems.

b. Proper Functionalism and Virtue Epistemology

According to John Greco (1993: 414), “the central idea of virtue epistemology is that, Gettier problems aside, knowledge is true belief which results from one’s cognitive virtues.” Similarly, Sosa (1993: 64) characterizes it as consisting of a family of theories which may be seen as “varieties of a single more fundamental option in epistemology, one which puts the explicative emphasis on truth-conducive intellectual virtues or faculties.”

Virtue epistemology is often thought of as coming in at least two varieties. Virtue responsibilists emphasize character traits—intellectual virtues such as open-mindedness, conscientiousness, perseverance in seeking the truth, and so on. Virtue reliabilists emphasize cognitive faculties, abilities, or competencies. (See Virtue Epistemology for more details). Of these two, it is virtue reliabilism that is most akin to proper functionalism. Accordingly, virtue reliabilism serves as a closer competitor. Or rather, since Greco (1993: 414) and Sosa (1993: 64) have both classified proper functionalism as a version of virtue epistemology, perhaps it should be said that it is the non-proper-functionalist versions thereof which may be seen as close competitors. For ease of exposition, the following discussion will focus on Sosa’s development of such a version.

According to Sosa’s (2015: 10) virtue theory of knowledge, knowledge is “apt belief” where apt belief is “belief that gets it right through competence rather than luck.” More precisely, according to Sosa, an apt belief is a belief that sufficiently manifests an “epistemic competence” (that is, a competence to get at the truth) (p. 9), where “a competence is in turn understood as a disposition to succeed in a given field of aimings, these being performances with an aim, whether the aim be intentional and even conscious, or teleological and functional” (p. 2). Note the similarity to proper functionalism here. Sosa’s epistemic competences are akin to Plantinga’s truth-aimed cognitive faculties. Both involve the property of being aimed at the formation of true beliefs, and both (when all goes well) are exercised in a way that is conducive to the fulfilment of that aim.

One way in which Sosa’s epistemic competencies differ from Plantinga’s truth-aimed cognitive faculties, however, is that the former do not initially seem to presuppose any notion of a design plan. And this might make Sosa’s theory more adept at accommodating things like Swampman scenarios (see the discussion in Section 1c). Indeed, it was Sosa (1993) who made famous that objection to proper functionalism. It might also make Sosa’s view more appealing to those who are both naturalistically inclined and skeptical about the prospects for a naturalistically acceptable account of cognitive proper function.

Proper functionalists have called into question whether Sosa’s view does in fact have these advantages. Plantinga (1993c: 79) has argued, for example, that in order to handle the Case of the Epistemically Serendipitous Brain Lesion (discussed in Section 1a), Sosa’s epistemic virtues must involve competencies or faculties that are subject to proper function conditions. If that is right, then, as Plantinga (p. 81) points out, Sosa’s view (developed so as not to be subject to this counterexample) becomes a variety of proper functionalism. It should be noted, however, that virtue epistemologists may have other ways of dealing with this case. John Greco (2010: 152) has suggested, for instance, that “in the brain lesion case, the problem is not so much a lack of health as it is a lack of cognitive integration.” “The cognitive processes associated with the brain lesion,” claims Greco, “are not sufficiently integrated with other of the person’s cognitive dispositions so as to count as being part of cognitive character.” Whether this reply is successful may turn on just what is necessary for a cognitive process to exhibit the kind of cognitive integration required. Greco (2010: 152) suggests his own, non-proper-functionalist criteria. But it is open to proper functionalists to argue that part of what is required is incorporation into one’s cognitive design plan.

Since then, Boyce and Moon (2015) have argued that there are other kinds of cases that pose a challenge to the claim that a true belief manifesting a competence is sufficient for its being an item of knowledge. As noted in Section 1c, Boyce and Moon propose a counterexample to what they regard as the central intuition underlying the Swampman Objection to proper functionalism. Their counterexample employs some of the cognitive science literature on initial knowledge, which supports the claim that human beings sometimes come to know things by way of innate, unlearned cognitive responses (see for example Spelke, 1994). Drawing from this literature (as well as from Bergmann, 2006: 116-121), Boyce and Moon argue that some of these innate responses are merely contingently appropriate ways of forming beliefs (where the appropriateness at issue is of a kind necessary for warrant). They argue that while these responses are appropriate for human beings, given the kinds of environments to which humans are adapted, the same need not have been true for other kinds of beings.

Boyce and Moon then go on to argue that these facts entail there are possible cases involving two cognitive agents, who are members of different species, coming to hold the same belief, in the same way, in the same environment, but in which that belief is warranted for one of them (on account of its resulting from a way of forming beliefs that is appropriate for members of that species) but not the other. They further argue that not only do these cases furnish counterexamples to the central intuition motivating the Swampman objection to proper functionalism, but that they also provide a challenge to alternative theories. Boyce and Moon suggest, for instance, that they afford potential counterexamples to Sosa’s theory, at least insofar as it does not recognize factors such as proper function conditions or species membership as relevant to competence possession.

Proper functionalists point to the kinds of cases alluded to above as lending support to the view that a belief having arisen by way of cognitive proper function is necessary for it to count as an item of knowledge. It should be acknowledged, however, that virtue epistemologists have pointed to other kinds of cases in which the opposite seems true. John Greco (2010: 151-153) has noted, for example, that there appear to be “cases of improper function that actually increase a person’s capacity to know.” Greco cites various cases documented by the neurologist Oliver Sacks (1970) in order to illustrate this point. “An obvious example,” says Greco, “is the story of autistic twins, who enjoyed incredible mathematical abilities associated with their autism.” Another case is that of “a man whose illness resulted in an increase in detail and vividness regarding childhood memory.” So much so, Greco notes, that when “these memories were put to use in accurate and detailed paintings of the man’s hometown in Italy…the man came to be considered an expert on the layout and appearance of that town, even though he had not visited there in decades.” Greco claims that these are cases in which “dysfunction gave rise to knowledge.”

What might a proper functionalist say in response to these scenarios? A couple of strategies are suggested by Plantinga (1993c: 74-75) in a reply to Richard Feldman (1993: 48-49). Feldman also points to these kinds of cases as creating difficulties for proper functionalism; in particular, Feldman cites the case of the autistic twins described above. As Feldman notes, these twins had the ability to “just see” (apparently without counting) that the number of matches that had fallen out of a box was 111. In his reply, Plantinga further notes that these same twins could also “just see,” it seems, whether a given six or eight digit number was prime. The first strategy Plantinga suggests for dealing with these cases is to call into question whether the individuals involved really do acquire knowledge in the scenarios described. The second is to concede (at least for the sake of argument) that they do, but argue that this is consistent with proper functionalism.

Regarding the first strategy, Plantinga notes that while the twins mentioned above can in fact reliably identify prime numbers, they lack, according to Sacks, the concept of multiplication. But if the twins lack the concept of multiplication, Plantinga argues, it is not clear that they genuinely grasp the concept of a prime number; so it is not clear that they have the relevant beliefs. Plantinga concedes, however, that this is a less plausible thing to say regarding the twins’ ability to discern the number of matches that had fallen out of a box. Here Plantinga turns to the second strategy. He concedes that while the twins’ “faculties obviously seem to malfunction in some ways,” it is doubtful that they are malfunctioning in producing the belief that there are 111 matches on the floor. Plantinga suggests that, perhaps, the twins have a different design plan than that of other human beings, and that this belief-forming tendency of theirs is subject to proper function conditions. In support of this claim, he notes that it seems possible that this remarkable ability of theirs might become damaged (in such a way that it is no longer reliable); in that case, he contends, we would be inclined to say that this ability had malfunctioned.

Another possibility open to the proper functionalists is to concede that these are cases in which cognitive malfunction enables the acquisition of knowledge, but only by way of truth-aimed proper function. If a typical human being, as a result of cognitive malfunction, suddenly found it seeming to her that she could just see that 111 matches had fallen out of a box, we might doubt that she really knows there are 111 matches. We might think that this belief, formed by way of this new-found tendency of hers, fails to count as knowledge, unless or until she has independent confirmation that the tendency is reliable. Once she does have such confirmation, we might concede that the resulting beliefs do count as knowledge, but only because she learned this to be a reliable way of getting at the truth. So perhaps, in at least some of the cases at issue, the individuals in question do acquire knowledge via belief-forming tendencies resulting from cognitive malfunction, but only by way of having learned those tendencies to be reliable. And if this learning occurs by way of cognitive processes that are in accord with proper function, these cases pose no difficulties for a proper functionalist theory.

This is perhaps not a plausible thing to say regarding all of these cases, however. It is not as plausible a thing to say regarding the individual whose illness caused him to form detailed memorial beliefs pertaining to his hometown in Italy, for example. One reason this is a less plausible thing to say concerning that case is that the person in question is (presumably) forming these beliefs in response to memory phenomenology, which is an epistemically appropriate way for human beings to form beliefs downstream from experience. We would be much less likely to judge this person as having knowledge if these same beliefs arose, say, in response to the kind of phenomenology associated with a vivid daydream, unaccompanied by memorial seemings, even if the resulting beliefs should turn out to be reliably formed. So, the proper functionalist might say, if cognitive malfunction is somehow enabling the acquisition of knowledge in this case, it is not by virtue of causing the subject to respond deviantly to his experience (since, in that regard, he is responding as proper function dictates). It must, rather, be by virtue of its causing some deviation upstream from experience (that is, by virtue of its producing an abnormality in the manner in which the subject’s memorial experiences are produced). Whether this creates a significant problem for proper functionalism, furthermore, may depend on just how the malfunction in question enables knowledge.

However exactly memory information is processed, stored, and retrieved so as to generate belief-producing memorial experiences, it is plausible that the cognitive system responsible (or set of systems responsible) has different functions associated with it. One of these functions is to generate experiences that reliably produce true beliefs. But no doubt there are other functions associated with this system that do not pertain to that goal (indeed, some may even be in tension with it). It is plausible, for instance, that some of those functions pertain to filtering information as it comes in, either by preventing some of that information from being stored in the first place, discarding some of that information after it has been stored, or preventing some of it from being encoded in the relevant experiences. The purpose of this filtration process might not be to secure the production of true beliefs, but to prevent various kinds of information overload, or to highlight important items of information at the expense of discarding others. Plausibly, what occurs in the case at issue is that a malfunction results in the suppression of these kinds of functions, leaving various other truth-aimed functions associated with the production of the relevant memorial experiences intact.

This consideration suggests yet another possible strategy the proper functionalist might have for dealing with these kinds of cases. Yes, she might grant, some of these are cases in which cognitive malfunction enables knowledge, but not by way of interfering with truth-aimed cognitive proper function (at least not with respect to the process that issued in the relevant beliefs). In at least some of these cases, the malfunction enables knowledge by preventing various non-truth-aimed aspects of cognitive proper function from interfering with or dampening various truth-aimed aspects (or perhaps by preventing some truth-aimed aspects of cognitive proper function from interfering with or dampening various other truth-aimed aspects). The consequence is that certain truth-aimed aspects of cognitive proper function result in various items of knowledge they would not have otherwise produced. So even though these are cases in which cognitive malfunction enables knowledge, the proper functionalist might say, they are not counterexamples to the claim that knowledge itself must come by way of truth-aimed cognitive proper function. Or, she might insist, to the extent to which it is unclear that these purported items of knowledge come by way of truth-aimed cognitive proper function, it is also unclear that we should count them as genuine items of knowledge.

It should be pointed out that many virtue theories of knowledge also quite naturally lend themselves to virtue theories of justification. As Sosa (2007: 22-23) points out, for instance, an agent can manifest skill in a performance even when that performance fails to achieve its aim or achieves it merely by luck. An archer might take a skillful shot (to use one of Sosa’s frequent analogies), for instance, while still missing the target (or hitting it only by luck) on account of erratic wind conditions. Similarly, a believer might manifest her skill at coming to hold true beliefs while nonetheless getting it wrong (or getting it right only by luck) on account of being in an epistemically bad environment. Under these circumstances, the belief in question may be said (in Sosa’s terminology) to be “adroit” but not “apt” (p. 23). A belief that is adroit, according to Sosa, may be said to be justified (in one good sense at least) even if it is not an item of knowledge (BonJour and Sosa 2003: 157).

As Sosa (2015: 26-27) himself is well aware, the having of a skill presupposes something like a normal environment. As Sosa points out, we do not say that a person lacks driving skill merely because she is disposed to perform poorly on an icy road in the midst of a snowstorm. What matters is whether she is disposed to perform well under ordinary driving conditions. Similarly, what matters for whether an agent is skilled at coming to hold true beliefs is whether she is capable of doing so in a certain kind of environment. But which sort of environment is the relevant one? According to Bergmann, this question points to an area in which a proper functionalist theory of justification has the advantage.

As Bergmann (2006: 142-143) notes, in a 1991 work Sosa takes justification to be relativized to an environment. The person in the demon world has justified beliefs relative to our environment, according to this view, but not relative to her own. Similarly, alien cognizers who have radically different methods of belief formation than we do (ones that are adapted to their own environment) may have beliefs that are justified relative to their environment but not relative to ours. Bergmann argues however that our ordinary concept of justification does not appear to be relativized in this way.

In later work from 2003, as Bergmann also points out, Sosa holds that there are two different senses in which a belief might be said to be justified. A belief is “adroit-justified” if the method by which it is formed is reliable in the actual world, and “apt-justified” if the method by which it is formed is reliable in the subject’s world. As Bergmann notes, however, this view does not account for our intuition that there is a single sense in which our beliefs, the alien cognizers’ beliefs, and the demon victims’ beliefs are all justified.

Proper functionalism, by contrast, Bergmann maintains, has no difficulty accommodating these intuitions, since it holds that the relevant environment is the one specified by the design plan (which is the same between us and the demon victims but different for the alien cognizers). Whether Bergmann points to a genuine advantage of his theory over Sosa’s in this regard has, however, been disputed. Markie (2009: 374-377) argues, for example, that Bergmann’s own theory faces disadvantages akin to those he attributes to Sosa’s.

As with most disputed views, the extent to which one is drawn to proper functionalist theories will depend in large measure on which intuitions one has, the relative weight one assigns to them, one’s assessment of how well the theories in question accommodate those intuitions, and whether their rivals do any better. And here one’s mileage may vary. But it is a safe bet that proper functionalist theories will continue to serve as serious contenders for the foreseeable future.

4. References and Further Reading

  • Bergmann, Michael. 2006. Justification Without Awareness: A Defense of Epistemic Externalism (Oxford: Oxford UP).
  • Bergmann, Michael. 2013. “Externalist Justification and the Role of Seemings” Philosophical Studies 166: 163-184.
  • BonJour, Laurence and Sosa, Ernest. 2003. Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues (Malden, MA: Blackwell Publishing).
  • Boyce, Kenneth and Plantinga, Alvin. 2012. “Proper Functionalism” The Continuum Companion to Epistemology, ed. Andrew Cullison (London: Continuum International Publishing Group).
  • Boyce, Kenneth and Moon, Andrew. 2015. “In Defense of Proper Functionalism: Cognitive Science Takes on Swampman,” Synthese Online First: DOI 10.1007/s11229-015-0899-6. http://link.springer.com/article/10.1007/s11229-015-0899-6.
  • Chignell, Andrew. 2003. “Accidentally True Belief and Warrant” Synthese 137: 445-458.
  • Crisp, Thomas M. “Gettier and Plantinga’s Revised Account of Warrant” Analysis 60: 42-50.
  • Cullison, Andrew. 2013. “Seemings and Semantics” Seemings and Justification, ed. Chris Tucker (Oxford: Oxford UP).
  • Feldman, Richard. 1993. “Proper Functionalism” Nous 27: 34-50.
  • Goldman, Alvin “The Sciences and Epistemology” The Oxford Handbook of Epistemology (Oxford: Oxford UP).
  • Graham, Peter. 2012. “Epistemic Entitlement” Nous 46: 449-482.
  • Graham, Peter. 2014. “Warrant, Functions, History” Naturalizing Epistemic Virtue, eds. Abrol Fairweather and Owen Flanagan (Cambridge: Cambridge University Press).
  • Greco, John. 1993. “Virtues and Vices of Virtue Epistemology” Canadian Journal of Philosophy 23: 413-432.
  • Greco, John. 2010. Achieving Knowledge: A Virtue-Theoretic Account of Epistemic Normativity (Cambridge: Cambridge University Press).
  • Kvanvig, Jonathan L. (ed.). 1996. Warrant in Contemporary Epistemology: Essays in Honor of Plantinga’s Theory of Knowledge (London: Rowman & Littlefield Publishers).
  • Lehrer, Keith, and Cohen, Stewart. 1983. “Justification, Truth, and Coherence” Synthese 55: 191-207.
  • Long, Todd R. 2012. “Mentalist Evidentialism Vindicated (and a Super-Blooper Epistemic Design Problem for Proper Function justification)” Philosophical Studies 157: 251-266.
  • Markie, Peter. 2009. “Justification and Awareness” Philosophical Studies 146: 361-377.
  • McNabb, Tyler Dalton. 2015. “Warranted Religion: Answering Objections to Alvin Plantinga’s Epistemology” Religious Studies 51: 477-495.
  • Millikan, Ruth. 1984. “Naturalist Reflections on Knowledge” Pacific Philosophical Quarterly, 4:  315-334.
  • Otte, Richard. 1987. “A Theistic Conception of Probability” Faith and Philosophy 4: 427-447.
  • Plantinga, Alvin. 1993a. Warrant: The Current Debate (Oxford: Oxford UP).
  • Plantinga, Alvin. 1993b. Warrant and Proper Function (Oxford: Oxford UP).
  • Plantinga, Alvin. 1993c. “Why We Need Proper Function” Nous 27: 66-82.
  • Spelke, Elizabeth. 1994. “Initial Knowledge: Six Suggestions” Cognition 50: 431-445.
  • Sacks, Oliver. 1970. The Man Who Mistook His Wife for a Hat (New York: HarperCollins).
  • Sosa, Ernest. 1993. “Proper Functionalism and Virtue Epistemology” Nous 27: 51-65.
  • Sosa, Ernest. 2007. A Virtue Epistemology: Apt Belief and Reflective Knowledge, Vol. 1 (Oxford: Oxford UP).
  • Sosa, Ernest. 2015. Judgement and Agency (Oxford: Oxford UP).
  • Tucker, Chris. 2011. “Phenomenal Conservatism and Evidentialism” Evidence and Religious Belief, eds. Kelly James Clark and Raymond J. VanArragon (Oxford: Oxford UP).
  • Tucker, Chris. 2014a. “If Dogmatists Have a Problem with Cognitive Penetration, You Do Too” Dialectica 68: 35-62.
  • Tucker, Chris. 2014b. “On What Inferentially Justifies What: The Vices of Reliabilism and Proper Functionalism,” Synthese 191: 3311-3328.
  • Wolterstorff, Nicholas. 2010. “Ought to Believe—Two Concepts” Practices of Belief: Selected Essays, Vol. 2, ed. Terence Cuneo (Cambridge: Cambridge University Press).

 

Author Information

Kenneth Boyce
Email: boyceka@missouri.edu
University of Missouri
U. S. A.

Dynamic Epistemic Logic

This article tells the story of the rise of dynamic epistemic logic. The rise began in the 1960s with the creation and development of epistemic logic, the logic of knowledge. Then in the late 1980s came dynamic epistemic logic, the logic of change of knowledge. Much of it was motivated by puzzles and paradoxes.

The number of active researchers in these logics grows significantly every year because there are so many relations and applications to computer science, to multi-agent systems, to philosophy, and to cognitive science.

The modal knowledge operators in epistemic logic are formally interpreted by employing binary accessibility relations in multi-agent Kripke models (relational structures), where these relations should be equivalence relations to respect the properties of knowledge. The operators for change of knowledge correspond to another sort of modality, more akin to a dynamic modality. A peculiarity of this dynamic modality is that it is interpreted by transforming the Kripke structures used to interpret knowledge, and not, at least not at first sight, by an accessibility relation given with a Kripke model. Although called “dynamic epistemic logic,” this two-sorted modal logic applies to more general settings than the logic of merely S5 knowledge.

The present article discusses in depth the early history of dynamic epistemic logic. It then mentions briefly a number of more recent developments involving factual change, one (of several) standard translations to temporal epistemic logic, and a relation to situation calculus (a well-known framework in artificial intelligence to represent change). Special attention is then given to the relevance of dynamic epistemic logic for belief revision, for speech act theory, and for philosophical logic. The part on philosophical logic pays attention to Moore sentences, the Fitch paradox, and the Surprise Examination.

Table of Contents

  1. Introduction
  2. An Example Scenario
  3. A History of DEL
    1. Announcements
    2. Other Informative Events
  4. DEL and Belief Revision
  5. DEL and Language
  6. DEL and Philosophy
  7. References and Further Reading

1. Introduction

In this overview we tell the story of the rise of dynamic epistemic logic. It is a bit presumptuous to call it a rise, but we can only observe this rather peculiar phenomenon. The number of active researchers in these logics grows every year because there are so many relations to computer science, to multi-agent systems, to philosophy, and to cognitive science. It all began with the logic of knowledge in the 1960s, and much of it was motivated by puzzles and paradoxes.

Dynamic logic is the logic of changing knowledge. The starting point of dynamic epistemic logic (DEL) is therefore the logic of knowledge. A founding publication is [42]. We refer to [41] for an overview of epistemic logic and references. A key feature of epistemic logic is that the information state of several agents can be represented by a Kripke model. Given a set of agents and a set of propositional variables, a Kripke model consists of a set of states, a set of accessibility relations (each one a binary relation on the domain), namely one for each agent, and a valuation (that tells which propositional variables are true in which states). In epistemic logic the set of states of a Kripke model is interpreted as a set of epistemic alternatives. The information state of an agent consists of those epistemic alternatives that are possible according to the agent, which is represented by the binary accessibility relation Rα. An agent α knows that a proposition φ is true in a state a of a Kripke model M (written M, a ⊨ Kαφ) if and only if that proposition φ is true in all the states that agent α considers possible in that state (that is, which are Rα-accessible from a). A proposition known by agent α may itself pertain to the knowledge of some agent (for instance if one considers the formula KαKβψ). In this way, a Kripke model with accessibility relations for all the agents represents the (higher-order) information of all relevant agents simultaneously.
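To make this concrete, here is a minimal sketch in Python (not part of the original presentation; the class and variable names are illustrative) of a multi-agent Kripke model together with the evaluation of the knowledge operator: an agent knows φ at a state exactly when φ holds at every state accessible for that agent. The example data anticipates the card scenario of the next section.

# A minimal sketch; nothing here is assumed beyond the definitions above.
class KripkeModel:
    def __init__(self, states, relations, valuation):
        self.states = set(states)        # epistemic alternatives
        self.relations = relations       # agent -> set of (state, state) pairs
        self.valuation = valuation       # state -> set of true propositional variables

    def accessible(self, agent, state):
        """States the agent considers possible from the given state."""
        return {t for (s, t) in self.relations[agent] if s == state}

    def holds_prop(self, p, state):
        return p in self.valuation[state]

    def knows(self, agent, phi, state):
        """K_agent(phi): phi (a predicate on states) holds in every accessible state."""
        return all(phi(t) for t in self.accessible(agent, state))

# The two-card scenario: 'rw' means Ann holds a red card and Bob holds a white card.
states = {"rr", "rw", "wr", "ww"}
relations = {
    "ann": {(s, t) for s in states for t in states if s[0] == t[0]},  # Ann sees only her own card
    "bob": {(s, t) for s in states for t in states if s[1] == t[1]},  # Bob sees only his own card
}
valuation = {s: {"ann_red" if s[0] == "r" else "ann_white",
                 "bob_red" if s[1] == "r" else "bob_white"} for s in states}
M = KripkeModel(states, relations, valuation)

# At the actual state rr, Ann knows her card is red but not the colour of Bob's card.
print(M.knows("ann", lambda t: M.holds_prop("ann_red", t), "rr"))  # True
print(M.knows("ann", lambda t: M.holds_prop("bob_red", t), "rr"))  # False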

In DEL, information change is modeled by transforming Kripke models. Since DEL is mostly about information change due to communication, the model transformations usually do not involve factual change. The bare physical facts of the world remain unchanged, but the agents’ information about the world changes. In terms of Kripke models that means that the accessibility relations of the agents have to change (and consequently the set of states of the model might change as well). Modal operators in dynamic epistemic languages denote these model transformations. The accessibility relation associated with these operators is not one within the Kripke model, but pertains to the transformation relation between the Kripke models, as the example in the next section will show.

In Section 2 an example scenario is presented which can be captured by DEL. In Section 3 an historical overview of the main approaches in DEL is presented, with details on their modelling techniques. Section 4 discusses how to model belief revision in DEL. Section 5 connects ideas between speech act theory and DEL. Finally, Section 6 is on the relation between DEL and philosophy: it deals with Moore-sentences, the Fitch-paradox, and the Surprise Examination.

2. An Example Scenario


Figure 1: A Kripke model for the situation in which two agents are each given a red or a white card.

Consider the following scenario: Ann and Bob are each given a card that is either red or white on one side (the face side) and nondescript on the other side (the back side). They only see their own card, and so they are ignorant about the other agent’s card. There are four possibilities: both have white cards, both have red cards, Ann has a white card and Bob has a red card, or the other way round. These are the states of the model, and are represented by informative names such as rw, meaning Ann was dealt a red card (r) and Bob was dealt a white card (w). Let us assume that both have red cards, that is, let the actual state be rr. This is indicated by the double lines around state rr in Figure 1. The states of the Kripke model are connected by lines, which are labeled (α or β, denoting Ann or Bob respectively) to indicate that the agents cannot distinguish the states thus connected. (To be complete it should also be indicated that no state can ever be distinguished from itself. For readability these “reflexive lines” are not drawn, but indeed the accessibility relations Rα and Rβ are equivalence relations, since epistemic indistinguishability is reflexive, symmetric and transitive.) In the model of Figure 1 there are no α-lines between those states where Ann has different cards, that is, she can distinguish states at the top, where she has a red card, from those at the bottom, where she has a white one. Likewise, Bob is able to distinguish the left half from the right half of the model. This represents the circumstance that Ann and Bob each know the colour of their own card but not the colour of the other’s card. In the Kripke model of Figure 1 we also see that the higher-order information is represented correctly. Both agents know that the other agent knows the colour of his or her card, and they know that they know this, and so on. It is remarkable that a single Kripke model can represent the information of both agents simultaneously.

Figure 2: A Kripke model for the situation after Ann tells Bob she has a red card.

Figure 3: A Kripke model for the situation after Ann might have looked at Bob’s card.

Suppose that after picking up their cards, Ann truthfully says to Bob “I have a red card”. The Kripke model representing the resulting situation is displayed in Figure 2. Now both agents know that Ann has a red card, and they know that they know she has a red card, and so on: it is common knowledge among them. (A formula φ is common knowledge among a group of agents if everyone in the group knows that φ, everyone knows that everyone knows that φ, and so on.) Hence there is no need anymore for states where Ann has a white card, so those do not appear in the Kripke model. Note that in the new Kripke model there are no longer any lines labeled β. No matter how the cards were dealt, Bob only considers one state to be possible: the actual one. Indeed, Bob is now fully informed.

Now that Bob knows the colour of Ann’s card, Bob puts his card face down on the table, and leaves the room for a moment. When he returns he considers it possible that Ann took a look at his card, but also that she didn’t. Assuming she did not look, the Kripke model representing the resulting situation is the one displayed in Figure 3. In contrast to the previous model, there are in this model lines for Bob again. This is because he is no longer completely informed about the situation. He does not know whether Ann knows the colour of his card, yet he still knows that both Ann and he have a red card. Only his higher-order information has changed. Ann on the other hand knows whether she has looked at Bob’s card and also knows whether she knows the colour of Bob’s card. She also knows that Bob considers it possible that she knows the colour of his card. In the model of Figure 3 we see that two states representing the same factual information can differ by virtue of the lines connecting them to other states: the state rr on the top and rr on the bottom only differ in higher-order information.

In this section, we have seen two ways in which information change can occur. Going from the first model to the second, the information change was public, in the sense that all agents received the same information. Going from the second to the third model involved information change where not all agents had the same information, because Bob did not know whether Ann looked at his card while he was away. The task of DEL is to provide a logic with which to describe these kinds of information change.

3. A History of DEL

DEL did not arise in a scientific vacuum. The “dynamic turn” in logic and semantics ([72], [34] and [60]) very much inspired DEL, and DEL itself can also be seen as a part of the dynamic turn. The formal apparatus of DEL is a lot like propositional dynamic logic (PDL) [40] and quantified dynamic logic (QDL) [39]. There is also a relation to update semantics (US) [36, 93], although in DEL, unlike in US, not all formulas are interpreted dynamically: formulas and updates are clearly distinguished.

The study of epistemic logic within computer science and AI led to the development of epistemic temporal logic (ETL) in order to model information change in multi-agent systems (see [25] and [55]). Rather than model change by modal operators that transform the model, change is modeled by the progression of time in these approaches. Yet the kinds of phenomena studied by ETL and DEL largely overlap.

After this brief sketch of the context in which DEL was developed, the remainder of the section focuses on the development of its two main approaches. The first is public announcement logic, which is presented in Section 3.1. The second, presented in Section 3.2, is the dominant approach in DEL (sometimes identified with DEL).

a. Announcements

The original publication: Plaza. The first dynamic epistemic logic, called public announcement logic (PAL), was developed by Plaza in [61], published in 1989. The example where Ann says to Bob that she has a red card is an example of a public announcement. A public announcement is a communicative event where all agents receive the same information and it is common knowledge among them that this is so. The language of PAL is given by the following Backus-Naur Form:

φ ::= p | ¬φ | (φ ∧ φ) | Kαφ | [φ]φ

Besides the usual propositional language, Kαφ is read as “agent α knows that φ,” and [φ]ψ is read as “after φ is announced, ψ is the case.” In the example above, we could for instance translate “After it is announced that Ann has a red card, Bob knows that Ann has a red card” as [rα]Kβrα.

An announcement is modeled by removing the states where the announcement is false, that is, by going to a submodel. This model transformation is the main feature of PAL’s semantics.

 

Given a Kripke model M with state a, the semantics of PAL is as follows:

(i) M, a ⊨ p iff p is true at a according to the valuation of M;
(ii) M, a ⊨ ¬φ iff not M, a ⊨ φ;
(iii) M, a ⊨ (φ ∧ ψ) iff M, a ⊨ φ and M, a ⊨ ψ;
(iv) M, a ⊨ Kαφ iff M, b ⊨ φ for all b such that aRαb;
(v) M, a ⊨ [φ]ψ iff M, a ⊨ φ implies M|φ, a ⊨ ψ.

In clause (v) the condition that the announced formula be true at the actual state entails that only truthful announcements can take place. The model M|φ is the model obtained from M by removing all non-φ states. The new set of states consists of the φ-states of M. Consequently, the accessibility relations as well as the valuation are restricted to these states. The propositional letters true at a state remain true after an announcement. This reflects the idea that communication can only bring about information change, not factual change.
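A minimal sketch, continuing the illustrative KripkeModel class from the earlier sketch (the names are ours, not standard), of the update M|φ: a truthful public announcement simply restricts the model to the states where the announced formula holds.

def announce(model, phi):
    """Return the submodel M|phi: keep only the states where phi holds."""
    kept = {s for s in model.states if phi(s)}
    relations = {a: {(s, t) for (s, t) in r if s in kept and t in kept}
                 for a, r in model.relations.items()}
    valuation = {s: model.valuation[s] for s in kept}
    return KripkeModel(kept, relations, valuation)

# Ann truthfully announces that she has a red card (the proposition r_alpha).
M2 = announce(M, lambda s: M.holds_prop("ann_red", s))

# [r_alpha] K_beta r_alpha: after the announcement Bob knows that Ann has a red card.
print(M2.knows("bob", lambda t: M2.holds_prop("ann_red", t), "rr"))  # True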

Gerbrandy and Groeneveld’s approach. A logic similar to PAL was developed independently by Gerbrandy and Groeneveld in [32], which is more extensively treated in Gerbrandy’s PhD thesis [30]. There are three main differences between this approach and Plaza’s approach. First of all, Gerbrandy and Groeneveld do not use Kripke models in the semantics of their language. Instead, they use structures called possibilities which are defined by means of non-wellfounded set theory [1], a branch of set theory where the foundation axiom is replaced by another axiom. Possibilities and Kripke models are closely linked: possibilities correspond to bisimulation classes of Kripke models [18]. Later, Gerbrandy provided semantics without using non-wellfounded set theory for a simplified version of his public announcement logic [31].

The second difference is that Gerbrandy and Groeneveld also consider announcements that are not truthful. In their view, a logic for announcements should model what happens when new information is taken to be true by the agents. Hence, according to them, what happens to be true deserves no special status. This is more akin to the notion of update in US. In terms of Kripke models this means that by updating, agents may no longer consider the actual state to be possible, that is, Rα may no longer be reflexive. In a sense it would therefore be more accurate to call this logic a dynamic doxastic logic (a dynamic logic of belief) rather than a dynamic epistemic logic, since according to most theories, knowledge implies truth, whereas beliefs need not be true.

Thirdly, their logic is more general in the sense that subgroup announcements are treated (where only a subgroup of the group of all agents acquires new information); and especially private announcements are considered, where only one agent gets information. These announcements are modeled in such a way that the agents who do not receive information do not even consider it possible that anyone has learned anything. In terms of Kripke models, this is another way in which Rα may lose reflexivity.

Adding common knowledge. Semantics for public, group and private announcements using Kripke models was proposed by Baltag, Moss, and Solecki in [14]. This semantics is equivalent to Gerbrandy’s semantics (as was shown in [58]). The main contribution in [14] to PAL was that their approach also covered common knowledge, which is an important concept when one is interested in higher-order information and plays an important role in social interaction (cf. [92]). The inclusion of common knowledge poses a number of technical problems.

b. Other Informative Events

Groeneveld and Gerbrandy’s approach. In addition to a logic for announcements, Gerbrandy also developed a system for more general information change involving many agents, each of whom may have a different perspective. This is for instance the case when Ann may look at Bob’s card.

In order to model this information change it is important to realize that distinct levels of information are not distinctly represented in a Kripke model. For instance what Ann actually knows about the cards depends on Rα, but what Bob knows about what Ann knows about the cards depends on Rα as well. Therefore changing something in the Kripke model, such as cutting a line, changes the information on many levels. In order to come to grips with this issue it really pays to use non-wellfounded semantics. One of the ways to think about the possibilities defined by Gerbrandy and Groeneveld is as infinite trees. In such a tree, distinct levels of information are represented by certain paths in the tree. By manipulating the appropriate part of the tree, one can change the agents’ information at the appropriate level. This insight stems from Groeneveld [37] and was also used by Renardel de Lavalette in [62], who introduces treelike lean modal structures using ordinary set theory in the semantics of a dynamic epistemic logic.

Van Ditmarsch’s approach. Inspired by Gerbrandy and Groeneveld’s work, Van Ditmarsch developed a dynamic epistemic logic for modeling information change in knowledge games, where the goal of the players is to obtain knowledge of some aspect of the game. Clue and Battleships are typical examples of knowledge games. Players are never deceived in such games, and therefore the dynamic epistemic logic of Gerbrandy and Groeneveld, in which reflexivity might be lost, seems unsuitable. In Van Ditmarsch’s Ph.D. thesis [86], a logic is presented where all model transformations are from Kripke models with equivalence relations to Kripke models with equivalence relations, which is thus tailored to information change involving knowledge. This approach was further streamlined by Van Ditmarsch in [87] and later extended to include concurrent actions (when two or more events occur at the same time) in [90]. One of the open problems of these logics is that a completeness proof for the axiom systems has not been obtained.

The dominant approach: Baltag, Moss and Solecki. Another way of modeling complex informative events was developed by [14], which has become the dominant approach in DEL. Their approach is highly intuitive and lies at the basis of many papers in the field: indeed, many refer to this approach simply as DEL. Their key insight was that information-changing events can be modeled in the same way as situations involving information. Given a situation, such as when Ann and Bob each have a card, one can easily provide a Kripke model for such a situation. One simply considers which states might occur and which of those states the agents cannot distinguish. One can do the same with events involving information. Given a scenario, such as Ann possibly looking at Bob’s card, one can determine which events might occur: either she looks and sees that it is red (she learns that rβ), or she sees that it is white (she learns that wβ), or she does not look at the card (she learns nothing new, indicated by the tautology Τ). It is clear that Ann can distinguish these particular events, but Bob cannot. Such models are called action models or event models.

An event model A is a triple consisting of a set of events E, a binary accessibility relation over E for each agent, and a precondition function that assigns a formula to each event. This precondition determines under what circumstances the event can actually occur. Ann can only truthfully say that she has a red card if in fact she does have a red card. The event model for the event where Ann might have looked at Bob’s card is given in Figure 4, where each event is represented by its precondition.


Figure 4: An event model for when Ann might look at Bob’s card.


Figure 5: The product update for the models of Figure 2 and Figure 4.


The Kripke model of the situation following the event is constructed with a procedure called a product update. For each state in the original Kripke model one determines which events could take place in that state (that is, one determines whether the precondition of the event is true at that state). The set of states of the new model consists of those pairs of states and events (a, e), which represent the result of event e occurring in state a. The new accessibility relation is now easy to determine. If two states were indistinguishable to an agent and two events were also indistinguishable to that agent, then the result of those events taking place in those states should also be indistinguishable. This implication also holds the other way round: if the results of two events happening in two states are indistinguishable, then the original states and events should be indistinguishable as well. (Van Benthem [73] characterizes product update as having perfect recall, no miracles, and uniformity.) The basic facts about the world do not change due to a merely communicative event, and so the valuation at (a, e) simply follows the old valuation at a.

Figure 6: The product update for the models of Figure 1 and Figure 4.


The model in Figure 5 is the result of a product update of the model in Figure 2 and the event model of Figure 4. One can see that this is the same as the model in Figure 3 (except for the names of the states), which indicates that product update yields the intuitively right result.

One may wonder whether the model in Figure 4 represents the event accurately. According to the event model Bob considers it possible that Ann looks at his card and sees that it is white. Bob, however, already knows that the card is red, and therefore should not consider this event possible. This criticism is justified and one could construct an event model that takes this into account, but the beauty of the event model is precisely that it is detached from the agents’ information about the world in such a way that it provides an accurate model of just the information the agents have about the event. This means that product update yields the right outcome regardless of the Kripke model of the situation in which the event occurred. For instance, taking the product update with the model of Figure 1 yields the Kripke model depicted in Figure 6, which represents the situation where Ann might look at Bob’s card immediately after the cards were dealt. The resulting model also represents that situation correctly. This indicates that in DEL static information and dynamic information can be separated.
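The product update itself can be sketched in a few lines of Python, continuing the illustrative KripkeModel and M2 from the earlier sketches (the encoding of the event model is our own, not the article’s): the new states are the state-event pairs whose precondition holds, and one pair is accessible from another for an agent exactly when both the states and the events are.

def product_update(model, events, event_relations, pre):
    """events: set of event names; event_relations: agent -> set of (event, event) pairs;
    pre: event -> predicate on states (the precondition of the event)."""
    new_states = {(s, e) for s in model.states for e in events if pre[e](s)}
    relations = {
        a: {((s, e), (t, f))
            for (s, e) in new_states for (t, f) in new_states
            if (s, t) in model.relations[a] and (e, f) in event_relations[a]}
        for a in model.relations
    }
    # a purely communicative event leaves the facts unchanged
    valuation = {(s, e): model.valuation[s] for (s, e) in new_states}
    return KripkeModel(new_states, relations, valuation)

# "Ann might look at Bob's card": she sees red, sees white, or does not look (learns nothing).
events = {"see_red", "see_white", "no_look"}
event_relations = {
    "ann": {(e, e) for e in events},                    # Ann can tell the three events apart
    "bob": {(e, f) for e in events for f in events},    # Bob cannot
}
pre = {
    "see_red":   lambda s: M2.holds_prop("bob_red", s),
    "see_white": lambda s: M2.holds_prop("bob_white", s),
    "no_look":   lambda s: True,
}
M3 = product_update(M2, events, event_relations, pre)

# Bob still knows both cards are red, but no longer knows whether Ann knows the colour of his card.
ann_knows_bobs_card = lambda t: M3.knows("ann", lambda u: M3.holds_prop("bob_red", u), t)
print(M3.knows("bob", ann_knows_bobs_card, ("rr", "no_look")))  # False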

In the logical language of DEL these event models appear as modalities [A, e], where e is taken to be the event that actually occurs. The language is given by the following Backus-Naur Form

φ ::= p | ¬φ | (φ ∧ φ) | Kαφ | CΓφ | [π]φ,  where  π ::= (A, e) | π ∪ π | π ; π

Clauses (i)–(iv) are the same as for PAL. In clause (v), the accessibility relation interpreting the common knowledge operator CΓ is the reflexive transitive closure of the union of the accessibility relations of the members of Γ. Clause (vi) is a standard clause for dynamic modalities, except that the accessibility relation for dynamic modalities is a relation on the class of all Kripke models. In clause (vii) it is required that the precondition of the event model be true in the actual state, thus ensuring that (a, e), the new actual state, exists in the product update. Clauses (viii) and (ix) are the usual semantics for non-deterministic choice and sequential composition.
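For clause (v), here is a minimal sketch (same illustrative classes as before) of how common knowledge can be checked: compute the set of states reachable under the reflexive transitive closure of the union of the group’s accessibility relations, and test the formula there.

def common_knowledge(model, group, phi, state):
    """C_Gamma(phi): phi holds at every state reachable from the given state
    via the relations of the agents in the group (reflexive transitive closure)."""
    union = set().union(*(model.relations[a] for a in group))
    reachable, frontier = {state}, {state}
    while frontier:
        step = {t for (s, t) in union if s in frontier} - reachable
        reachable |= step
        frontier = step
    return all(phi(t) for t in reachable)

# After Ann's announcement it is common knowledge between Ann and Bob that she has a red card.
print(common_knowledge(M2, {"ann", "bob"}, lambda t: M2.holds_prop("ann_red", t), "rr"))  # True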

Not only can informative events in which different agents have different perspectives be modeled in DEL; public announcements, too, can be thought of in terms of event models. A public announcement can be modeled by an event model containing just one event: the announcement. All agents know this is the actual event, so it is the only event considered possible. Indeed, DEL is a generalization of PAL.

Criticism, alternatives, and extensions. Many people feel somewhat uncomfortable with having models as syntactical objects. Baltag and Moss have tried to accommodate this by proposing different languages while maintaining an underlying semantics using event models [10, 13]. This issue is extensively discussed in [91, Section 6.1]. There are alternatives using hybrid logic [70] and algebraic logic ([11], [12]). Most papers just use event models in the language.

DEL has been extended in various ways. Operators for factual change [85, 81] and past operators from temporal logic have been added [64, 5]. DEL has been combined with probability [45] and justification logic [63], and has been extended so that belief revision is also within its grasp. Connections have been made between DEL and various other logics. Its relation to PDL, ETL [80], AGM belief revision, and situation calculus [83] has been studied. DEL has been applied to a number of puzzles and paradoxes from recreational mathematics and philosophy. It has also been applied to problems in game theory (see [79] for a very detailed survey), as well as issues in computer security [94]. Complexity and succinctness of DEL have been investigated in [54, 69, 6]. Two recent overviews of DEL are [17, 78]. In the next section we pay attention to DEL and belief revision.

4. DEL and Belief Revision

Something you cannot model in DEL is changing your mind. Once you know a fact, you know it forever; that is, once Kap is true, it remains true after every update. Even when we have weaker constraints on the accessibility relations (for belief, or even general accessibility), this remains the case. But sometimes, when you believe a fact, you change your mind, and you may come to believe the opposite. This is not shocking: it might be that you merely did not believe it firmly. This means a change of Kap into Ka¬p or, using the better suited belief modality Ba for that, a change of Bap into Ba¬p. In a different community, that of (AGM) belief revision, this is the most natural operation around—indeed called ‘belief revision’. In this section we briefly survey interactions between such AGM belief revision and dynamic epistemic logic.

Belief revision has been studied from the perspective of structural properties of reasoning about changing beliefs [29], from the perspective of changing, growing and shrinking knowledge bases, and from the perspective of models and other structures of belief change wherein such knowledge bases may be interpreted, or that satisfy assumed properties of reasoning about beliefs. A typical approach involves preferential orders to express increasing or decreasing degrees of belief [48, 56], where these works refer to the ‘systems of spheres’ in [51, 38]. Within this tradition multi-agent belief revision has also been investigated, for example, belief merging [46]. Belief operators are normally not explicit in the logical language, so that higher-order beliefs (I know that you are ignorant of a certain proposition) cannot be formalized. Iterated belief revision may also be problematic.

The link between belief revision and modal logic, that is, explicit belief modalities and belief change modalities in the logical language, was made in a strand of research known as dynamic doxastic logic. This was proposed and investigated by Segerberg and collaborators in works such as [68, 52, 67, 22]. These works are distinct from other approaches to belief revision in modal logics, without dynamic modal operators, such as [19, 50, 20], that also influenced the development of dynamic logics combining knowledge and belief change. In dynamic doxastic logics belief operators are in the logical language, and belief revision operators are dynamic modalities. Higher-order belief change, that is, revising one’s beliefs about one’s own or other agents’ beliefs and ignorance, is considered problematic in dynamic doxastic logic, see [52]. In [68, 67] belief revision is restricted to propositional formulas (factual revision). There are dynamic doxastic logics wherein [*φ] merely means belief revision with φ according to some externally defined strategy, as in AGM style (this is the general setup in [68], not unlike the nonepistemic/doxastic modal setup in [71]), but there are also dynamic doxastic logics, such as [67], wherein [*φ] is a recipe operating on a semantic structure and outputting a novel structure, the standard approach in dynamic epistemic logic.

Belief revision in dynamic epistemic logic was initiated in [4, 88, 77, 15]. From these, [4, 88] propose a treatment involving degrees of belief and based on degrees of plausibility among states in structures interpreting such logics, so-called quantitative dynamic belief revision; whereas [77, 15] propose a treatment involving comparative statements about plausibilities (a binary relation between states denoting more/less plausible), so-called qualitative dynamic belief revision. The latter is clearly more suitable for logics of belief revision, and for notions such as conditional belief. The analogue of the AGM postulate of success must be given up when one incorporates higher-order belief change as in dynamic epistemic logic, where a prime mover is again the Moore sentence of the form ‘proposition p is true but you don’t know it’, which cannot after acceptance be believed by you. Many more works on dynamic belief revision have appeared since, for example, [33, 53, 24]. A prior, independent strand to model belief revision was in temporal epistemic logic, and was initiated in the mid 1990s in [28]. Their integrated treatment of belief, knowledge, plausibilities, and change is similar to the more recent developments to model belief revision in dynamic epistemic logic, and the relation between the two approaches is incompletely understood.

For an example of belief revision in dynamic epistemic logic, consider one agent and a proposition p that the agent is uncertain about. The agent could be Ann, who is uncertain whether Bob has a red card, as in the proposition rβ before. We get the Kripke model depicted in Figure 7, not dissimilar from that in Figure 2. There are two states of the world, one where p is false and another one where p is true. Let us suggestively call them 0 and 1, respectively. The agent has epistemic preferences among these states. Namely, she considers it most plausible that 1 is the actual state, that is, that p is true, and less plausible that 0 is the actual state. We write 1 < 0 where, as is common in the area, the minimal element in the order is the most plausible state (and not, as might be expected, the least plausible state). Let us further assume that p is false.

Figure 7: Ann believes p but considers ¬p epistemically possible.

The agent believes a proposition when it holds in the most plausible states. For example, she believes that p is true. This is formalized as

Bap

We write Ba (for belief) instead of Ka (for knowledge), as beliefs may be mistaken. Indeed, the agent believes that p but in fact p is false! But we also distinguish a modality for knowledge.

The agent knows a proposition when it holds in all plausible states. These are her strongest beliefs, or knowledge. In the case of this example her factual knowledge only involves tautologies such as p ∨ ¬p. This is described as

Ka(p ∨ ¬p)

Now imagine that the agent wants to revise her current beliefs. She believes that p is true, but has been given sufficient reason to be willing to revise her beliefs with ¬p instead. We can accomplish that when we allow a model transformation that makes the 0 state more plausible than the 1 state. There are various ways to do that. In this simple example we can simply observe that it suffices to make the state satisfying the revision formula ¬p, that is, 0, more plausible than the other state, 1. See Figure 8. As a consequence of that, the agent now believes ¬p: Ba¬p is true. Therefore, the revision was successful. This can already be expressed in the initial situation by using a dynamic modal operator [*¬p] for the relation induced by the program “belief revision with ¬p”, followed by what should hold after that program is executed. In this dynamic modal setting we can then write that

¬p ∧ Bap ∧ [*¬p]Ba¬p

was already true at the outset.

Figure 8: Ann revises her belief with ¬p.

In dynamic epistemic logic, unlike in the original AGM or the subsequent DDL setting, beliefs and knowledge can also be about modal formulas. For example, we not only have that Bap, but we also have that Ba(¬Kap ∧ ¬Ka¬p): the agent believes that she does not know whether p. We might say: Ann is aware that her belief in p is not very strong, that it is defeasible.
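The single-agent example can be sketched along the following lines (a toy encoding of ours, not the formal semantics of any particular system): belief is truth in the most plausible states, knowledge is truth in all of them, and revision with ψ promotes the ψ-states to the front of the plausibility order.

def believes(states, rank, phi):
    """B(phi): phi holds in all minimally ranked (most plausible) states."""
    best = min(rank(s) for s in states)
    return all(phi(s) for s in states if rank(s) == best)

def knows(states, phi):
    """K(phi): phi holds in all states under consideration."""
    return all(phi(s) for s in states)

def revise(rank, psi):
    """Return a new plausibility order in which psi-states precede non-psi-states."""
    return lambda s: (0 if psi(s) else 1, rank(s))

# Two states: p is true at state 1, false at state 0; Ann finds 1 more plausible (1 < 0).
states = [0, 1]
rank = lambda s: 0 if s == 1 else 1
p = lambda s: s == 1

print(believes(states, rank, p))                      # True:  B_a p
print(knows(states, p))                               # False: not K_a p
print(knows(states, lambda s: p(s) or not p(s)))      # True:  K_a (p or not p)

rank2 = revise(rank, lambda s: not p(s))              # the update [*not-p]
print(believes(states, rank2, lambda s: not p(s)))    # True:  B_a not-p afterwards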

5. DEL and Language

Consider the connection between DEL and speech act theory. Speech act theory started with the work of [7], who argued that language is used to perform all sorts of actions; we make promises, we ask questions, we issue commands, and so forth. An example of a speech act is a bartender who says “The bar will be closed in five minutes” [8]. Austin distinguishes three kinds of acts that are performed by the bartender (i) the locutionary act of uttering the words, (ii) the illocutionary act of informing his clientele that the bar will close in five minutes, and (iii) the perlocutionary act of getting the clientele to order one last drink and leave.

Truth conditions, which determine whether an indicative sentence is true or false, are generalized to success conditions, which determine whether a speech act is successful or not. In speech act theory there are several distinctions when it comes to the ways in which something can go wrong with a speech act [7, p. 18]. Here we do not make such distinctions and simply speak of success conditions. Searle gives in [66, p. 66] the following success conditions, among others, for an assertion that p by speaker S to hearer H:

  • S has evidence (reasons, and so forth) for the truth of p.
  • It is not obvious to both S and H that H knows (does not need to be reminded of, and so forth) p.
  • S believes p.

Speech act theory has been embraced by the multi-agent systems community, for example, by the Foundation for Intelligent Physical Agents (FIPA). FIPA is an IEEE Computer Society standards organization that promotes agent-based technology and the interoperability of its standards with other technologies. It published a Communicative Act Library Specification [26] that includes a specification of the inform action, which is similar to Searle’s analysis of assertions.

It is worthwhile to join this analysis of assertions to the analysis of public announcements in PAL. It is clear from the list of success conditions that one usually only announces what one believes (or knows) to be true. So, an extra precondition for an announcement that φ by an agent a should be that ka φ. Public announcements are indeed modeled in this way in [61].

As an example, consider the case when Ann tells Bob she has a red card: it is more appropriate to model this as an announcement that ka ra rather than the announcement that ra. Fortunately, these formulas were equivalent in the model under consideration. Suppose that Ann had said “We do not both have white cards”. When this is modeled as an announcement that ¬(wa ∧ wb), we obtain the model in Figure 9(a). However, Ann only knows this statement to be true when she in fact has a red card herself. Indeed, when we look at the result of the announcement that ka ¬(wa ∧ wb) we obtain the model in Figure 9(b). We see that the result of this


Figure 9: An illustration of the difference between the effect of the announcement that φ and the announcement that ka φ, and of an announcement that only changes the agents’ higher-order information

announcement is the same as when Ann says that she has a red card (see Figure 2). By making presuppositions part of the announcement, we are in a way accommodating the precondition (see also [44]).

The second success condition in Searle’s analysis conveys that an announcement ought to provide the hearer with new information. In the light of DEL, one can revise this second success condition by saying that p is not common knowledge, thus taking higher-order information into account. It seems natural to assume that a speaker wants to achieve common knowledge of p, since that plays an important role in coordinating social actions; and so lack of common knowledge of p is a condition for the success of announcing p.

Consider the situation where Ann did look at Bob’s card when he was away and found out that he has a red card (Figure 9(c)). Suppose that upon Bob’s return Ann tells him “I do not know that you have a white card”. Both Ann and Bob already know this, and they also both know that they both know it. Therefore Searle’s second condition is not fulfilled, and so according to his analysis there is something wrong with Ann’s assertion. The result of this announcement is given in Figure 9(d). We see that the information of the agents has changed. Now Bob no longer considers it possible that Ann considers it possible that Bob considers it possible that Ann knows that Bob has a white card. And so the announcement is informative. One can give more and more involved examples to show that change of common knowledge is indeed a more natural requirement for announcements than Searle’s second condition, especially in multi-agent scenarios.

Van Benthem [76] analyzes question and answer episodes using DEL. One of the success conditions of questions as speech acts is that the speaker does not know the answer [66, p. 66]. Therefore posing a question can reveal crucial information to the hearer in such a way that the hearer only knows the answer after the question has been posed ([74],[91, p. 61],[82]).

Professor a is program chair of a conference on Changing Beliefs. It is not allowed to submit more than one paper to this conference, a rule that all authors of papers abided by (although the belief that this rule makes sense is gradually changing, but this is beside the point here). Our program chair a likes to have all decisions about submitted papers out of the way before the weekend, since on Saturday he is due to travel to attend a workshop on Applying Belief Change. Although there appears not to be enough time to notify all authors, fortunately, just before he leaves for the workshop, his reliable secretary assures him that she has informed all authors of rejected papers, by personally giving them a call and informing them of the sad news concerning their paper.

Freed from this burden, Professor a is just in time for the opening reception of the workshop, where he meets the brilliant Dr. b. The program chair remembers that b submitted a paper to Changing Beliefs, but to his own embarrassment he must admit that he honestly cannot remember whether it was accepted or not. Fortunately, he does not have to demonstrate his ignorance to b, because b’s question ‘Do you know whether my paper has been accepted?’ gives him the answer, as he reasons as follows: a is sure that had b’s paper been rejected, b would have had that information, in which case b would not have shown his ignorance to a. So, instantaneously, a updates his belief with the fact that b’s paper is accepted, and he can now answer truthfully with respect to this newly revised belief set.

This phenomenon shows that when a question is regarded as a request [49], the success condition that the hearer is able to grant the request, that is, provide the answer to the question, must be fulfilled after the request has been made, and not before. (However, it is not commonly agreed upon in the literature that questions can be regarded as requests; cf. [35, Section 3].) This analysis of questions in DEL fits well within the broad interest in questions in dynamic semantics [3]. Recent work on DEL and questions is [2, 59, 23].

6. DEL and Philosophy

The role of public announcements as typical informative speech acts focused attention on a number of situations wherein that form of success cannot be achieved. This has been investigated mainly within philosophical logic, under the headings of ‘Moore sentences’ and the ‘Fitch paradox’. The Moore sentence was introduced by Moore in [57], and his original analysis is that p ∧ ¬Kp (p is true and I don’t know/believe it) cannot sincerely be uttered. As this is an informative speech act, you are supposed to come to believe what is being announced. It seems incoherent, and maybe even paradoxical, to believe a proposition stating that you do not believe it. In the DEL setting we can give this a dynamic interpretation. It is then no longer paradoxical.

If I tell you “You don’t know that I play cello”, this has the conversational implicature “You don’t know that I play cello and it is true that I play cello”. This has the form p ∧ ¬Kp. Suppose I were to tell you again “You don’t know that I play cello.” Then you can respond: “You’re lying. You just told me that you play cello.” We can analyze what is going on here in modal logic. We model your uncertainty, for which a single epistemic modality suffices. Initially, there are two possible worlds, one in which p is true and another one in which p is false, which you cannot distinguish from one another. Although in fact p is true, you don’t know that: p ∧ ¬Kp holds. The announcement of p ∧ ¬Kp results in a restriction of these two possibilities to those where the announcement is true: in the p-world, p ∧ ¬Kp is true, but in the ¬p-world, p ∧ ¬Kp is false.

In the model restriction consisting of the single world where p is true, p is known: Kp. Given that Kp is true, so is ¬p ∨ Kp, and ¬p ∨ Kp is equivalent to ¬(p ∧ ¬Kp), the negation of the announced formula. So, announcement of p ∧ ¬Kp makes it false! Gerbrandy [30, 31] calls this phenomenon an unsuccessful update; the matter is also taken up in [89, 43, 84].
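The unsuccessful update can be traced step by step in a small sketch of public announcement as model restriction. The code below is illustrative only (the class, the world names and the formula encoding are ours, not from any cited system); it checks the Moore sentence before and after its own announcement.

```python
# Public announcement as model restriction: announcing p & ~Kp (a Moore
# sentence) makes it false afterwards, the 'unsuccessful update'.

class EpistemicModel:
    def __init__(self, worlds, valuation):
        self.worlds = set(worlds)      # worlds the agent cannot distinguish
        self.valuation = valuation     # world -> set of true atoms

    def holds(self, formula, world):
        return formula(self, world)

    def announce(self, formula):
        """Keep only the worlds where the announced formula holds."""
        kept = {w for w in self.worlds if self.holds(formula, w)}
        return EpistemicModel(kept, self.valuation)

p   = lambda m, w: 'p' in m.valuation[w]
K   = lambda f: (lambda m, w: all(f(m, v) for v in m.worlds))
Not = lambda f: (lambda m, w: not f(m, w))
And = lambda f, g: (lambda m, w: f(m, w) and g(m, w))

moore = And(p, Not(K(p)))              # p & ~Kp

# Two indistinguishable worlds; p is in fact true in world 'w1'.
m = EpistemicModel({'w0', 'w1'}, {'w0': set(), 'w1': {'p'}})

print(m.holds(moore, 'w1'))            # True: the Moore sentence holds
m2 = m.announce(moore)                 # restrict to worlds where it held
print(m2.holds(K(p), 'w1'))            # True: now Kp holds ...
print(m2.holds(moore, 'w1'))           # False: ... so the announcement made itself false
```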

We continue with some words on the Fitch paradox [27]. A standard analysis of the Fitch paradox is as follows; see the excellent review of the literature on Fitch’s paradox in the Stanford Encyclopedia of Philosophy [21], and the volume dedicated to knowability [65]. The existence of unknown truths is formalized as ∃p (p ∧ ¬Kp). The requirement that all truths are knowable is formalized as ∀p (p → ◊Kp), where ◊ formalizes the existence of some process after which p is known, or an accessible world in which p is known. Fitch’s paradox is that the existence of unknown truths is inconsistent with the requirement that all truths are knowable.

The Moore sentence p ∧ ¬Kp witnesses the existential statement ∃p (p ∧ ¬Kp). Assume that it is true. From the knowability requirement ∀p (p → ◊Kp) follows the truth of its instance (p ∧ ¬Kp) → ◊K(p ∧ ¬Kp), and from that and p ∧ ¬Kp follows ◊K(p ∧ ¬Kp). Whatever the interpretation of ◊, it results in having to evaluate K(p ∧ ¬Kp). But this is inconsistent for knowledge and belief.

We now get to the relation between knowability and DEL. The suggestion to interpret ‘knowable’ as ‘known after an announcement’ was made by van Benthem in [75], and [9] proposes a logic where ‘φ is knowable’ is interpreted in that way. In this setting, ◊p stands for ‘there is an announcement after which p (is true)’, so that ◊Kp stands for ‘there is an announcement after which p is known’, which is a form of ‘proposition p is knowable’.

For example, consider the proposition p for ‘it rains in Liverpool’. Suppose you are ignorant about p: ¬(Kp ∨ K¬p). First, suppose that p is true. I can announce to you here and now that it is raining in Liverpool (according to your expectations, maybe…), after which you know it: 〈p〉Kp stands for ‘p is true and after announcing p, p is known’ (〈φ〉 is the dual of [φ], that is, 〈φ〉ψ is defined by abbreviation as ¬[φ]¬ψ). Now, suppose that p is false. In a similar way, after I announce that, you know that; so we have 〈¬p〉K¬p. If you already knew whether p, having its value announced does not have any informative consequence for you. Therefore, 〈p〉Kp ∨ 〈¬p〉K¬p is a validity, and therefore we also have 〈p〉(Kp ∨ K¬p) ∨ 〈¬p〉(Kp ∨ K¬p). We can generalize the statement ‘there is a proposition p such that after its announcement, p is known’ to ‘there exists a proposition q such that after its announcement, p is known’, where q is not necessarily the same as p. Then we have informally captured the meaning of ◊Kp. In other words, this operator is a quantification over announcements. But we have then just proved that ◊(Kp ∨ K¬p) is a validity. For more on such matters, see [9, 84].
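In the same illustrative vein, the following few lines check by brute force, over the two possible actual situations, that announcing the true value of p always leads to knowing whether p; the encoding (world names, sets) is ours and only meant to mirror the two-world ignorance model above.

```python
# Check of <p>Kp v <~p>K~p on the two-world ignorance model: whichever world
# is actual, announcing the true value of p leaves the agent knowing whether p.

worlds = {'w0', 'w1'}          # indistinguishable for the agent
p_true_at = {'w1'}             # p holds only at w1

def knows_whether_p(remaining):
    """Kp or K~p: all remaining worlds agree on the value of p."""
    return remaining <= p_true_at or remaining.isdisjoint(p_true_at)

for actual in worlds:                        # whichever world is actual ...
    truthful = p_true_at if actual in p_true_at else worlds - p_true_at
    after = worlds & truthful                # announce the true value of p
    assert knows_whether_p(after)            # ... afterwards: Kp or K~p
print("<p>Kp v <~p>K~p holds in both cases")
```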

Another paradox in philosophical logical circles that has been analyzed with DEL methods (and that has similar Moore-sentence-like symptoms) is the Surprise Examination. This has been investigated in works such as [30, 31, 89], and more recently by Baltag and Smets using plausibility epistemic structures, along the lines of [16].

Parts of the materials for this overview have been taken from [88, 47, 84], and subsequently revised to make it into a single comprehensive text.

7. References and Further Reading

  • [1] P. Aczel. Non-Well-Founded Sets. CSLI Publications, Stanford, CA, 1988. CSLI Lecture Notes 14.
  • [2] T. Ågotnes, J. van Benthem, H. van Ditmarsch, and S. Minica. Question-answer games, 2011.
  • [3] M. Aloni, A. Butler, and P. Dekker, editors. Questions in Dynamic Semantics. Elsevier, Amsterdam, 2007.
  • [4] G. Aucher. A combined system for update logic and belief revision. In Proc. of 7th PRIMA, pages 1–17. Springer, 2005. LNAI 3371.
  • [5] G. Aucher and A. Herzig. From DEL to EDL : Exploring the power of converse events. In K. Mellouli, editor, Proc. of ECSQARU, LNCS 4724, pages 199–209. Springer, 2007.
  • [6] G. Aucher and F. Schwarzentruber. On the complexity of dynamic epistemic logic. In Proc. of 14th TARK, 2013.
  • [7] J. L. Austin. How to Do Things with Words. Clarendon Press, Oxford, 1962.
  • [8] K. Bach. Speech acts. In E. Craig, editor, Routledge Encyclopedia of Philosophy, volume 8, pages 81–87. Routledge, London, 1998.
  • [9] P. Balbiani, A. Baltag, H. van Ditmarsch, A. Herzig, T. Hoshi, and T. De Lima. ‘Knowable’ as ‘known after an announcement’. Review of Symbolic Logic, 1(3):305–334, 2008.
  • [10] A. Baltag. A logic for suspicious players: epistemic actions and belief-updates in games. Bulletin of Economic Research, 54(1):1–45, 2002.
  • [11] A. Baltag, B. Coecke, and M. Sadrzadeh. Algebra and sequent calculus for epistemic actions. Electronic Notes in Theoretical Computer Science, 126:27–52, 2005.
  • [12] A. Baltag, B. Coecke, and M. Sadrzadeh. Epistemic actions as resources. J. of Logic Computat., 17(3):555–585, 2007.
  • [13] A. Baltag and L. S. Moss. Logics for epistemic programs. Synthese, 139:165–224, 2004.
  • [14] A. Baltag, L. S. Moss, and S. Solecki. The logic of public announcements, common knowledge, and private suspicions. In I. Gilboa, editor, Proceedings of TARK 98, pages 43–56, 1998.
  • [15] A. Baltag and S. Smets. A qualitative theory of dynamic interactive belief revision. In Proc. of 7th LOFT, Texts in Logic and Games 3, pages 13–60. Amsterdam University Press, 2008.
  • [16] A. Baltag and S. Smets. Group belief dynamics under iterated revision: fixed points and cycles of joint upgrades. In Proc. of 12th TARK, pages 41–50, 2009.
  • [17] A. Baltag, H. van Ditmarsch, and L.S. Moss. Epistemic logic and information update. In J. van Benthem and P. Adriaans, editors, Handbook on the Philosophy of Information, pages 361–456, Amsterdam, 2008. Elsevier.
  • [18] P. Blackburn, M. de Rijke, and Y. Venema. Modal Logic. Cambridge University Press, Cambridge, 2001. Cambridge Tracts in Theoretical Computer Science 53.
  • [19] O. Board. Dynamic interactive epistemology. Games and Economic Behaviour, 49:49–80, 2004.
  • [20] G. Bonanno. A simple modal logic for belief revision. Synthese (Knowledge, Rationality & Action), 147(2):193–228, 2005.
  • [21] B. Brogaard and J. Salerno. Fitch’s paradox of knowability, 2004. http://plato.stanford.edu/archives/sum2004/entries/fitch-paradox/.
  • [22] J. Cantwell. Some logics of iterated belief change. Studia Logica, 63(1):49–84, 1999.
  • [23] I. Ciardelli and F. Roelofsen. Inquisitive dynamic epistemic logic. Manuscript, 2013.
  • [24] C. Dégremont. The Temporal Mind. Observations on the logic of belief change in interactive systems. PhD thesis, University of Amsterdam, 2011. ILLC Dissertation Series DS-2010-03.
  • [25] R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about Knowledge. MIT, Cambridge, Massachusetts, 1995.
  • [26] FIPA. FIPA communicative act library specification, 2002. http://www.fipa.org/.
  • [27] F.B. Fitch. A logical analysis of some value concepts. The Journal of Symbolic Logic, 28(2):135–142, 1963.
  • [28] N. Friedman and J.Y. Halpern. A knowledge-based framework for belief change – part i: Foundations. In Proc. of 5th TARK, pages 44–64. Morgan Kaufmann, 1994.
  • [29] P. Gärdenfors. Knowledge in Flux: Modeling the Dynamics of Epistemic States. Bradford Books, MIT Press, Cambridge, MA, 1988.
  • [30] J. Gerbrandy. Bisimulations on Planet Kripke. PhD thesis, University of Amsterdam, 1998. ILLC Dissertation Series DS-1999-01.
  • [31] J. Gerbrandy. The surprise examination in dynamic epistemic logic. Synthese, 155(1):21–33, 2007.
  • [32] J. Gerbrandy and W. Groeneveld. Reasoning about information change. J. Logic, Lang., Inform., 6:147–169, 1997.
  • [33] P. Girard. Modal logic for belief and preference change. PhD thesis, Stanford University, 2008. ILLC Dissertation Series DS-2008-04.
  • [34] P. Gochet. The dynamic turn in twentieth century logic. Synthese, 130(2):175–184, 2002.
  • [35] J. Groenendijk and M. Stokhof. Questions. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, pages 1055–1124. Elsevier, Amsterdam, 1997.
  • [36] J. Groenendijk, M. Stokhof, and F. Veltman. Coreference and modality. In S. Lappin, editor, The Handbook of Contemporary Semantic Theory, pages 179–213. Blackwell, Oxford, 1996.
  • [37] W. Groeneveld. Logical Investigations into Dynamic Semantics. PhD thesis, University of Amsterdam, 1995. ILLC Dissertation Series DS-1995-18.
  • [38] A. Grove. Two modellings for theory change. Journal of Philosophical Logic, 17:157–170, 1988.
  • [39] D. Harel. First-Order Dynamic Logic. LNCS 68. Springer, 1979.
  • [40] D. Harel. Dynamic logic. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume II, pages 497–604, Dordrecht, 1984. Kluwer Academic Publishers.
  • [41] Vincent Hendricks and John Symons. Epistemic logic. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Spring 2006.
  • [42] J. Hintikka. Knowledge and Belief. Cornell University Press, Ithaca, NY, 1962.
  • [43] W. Holliday and T. Icard. Moorean phenomena in epistemic logic. In L. Beklemishev, V. Goranko, and V. Shehtman, editors, Advances in Modal Logic 8, pages 178–199. College Publications, 2010.
  • [44] J. Hulstijn. Presupposition accommodation in a constructive update semantics. In G. Durieux, W. Daelemans, and S. Gillis, editors, Proceedings of CLIN VI, 1996.
  • [45] J. van Benthem, J. Gerbrandy, and B. Kooi. Dynamic update with probabilities. Studia Logica, 93(1):67–96, 2009.
  • [46] S. Konieczny and R. Pino Pérez. Merging information under constraints: A logical framework. Journal of Logic and Computation, 12(5):773–808, 2002.
  • [47] B.P. Kooi. Dynamic epistemic logic. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, pages 671–690. Elsevier, 2011. Second edition.
  • [48] S. Kraus, D. Lehmann, and M. Magidor. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence, 44:167–207, 1990.
  • [49] R. Lang. Questions as epistemic requests. In H. Hiż, editor, Questions, pages 301–318. Reidel, Dordrecht, 1978.
  • [50] N. Laverny. Révision, mises à jour et planification en logique doxastique graduelle. PhD thesis, Institut de Recherche en Informatique de Toulouse (IRIT), Toulouse, France, 2006.
  • [51] D.K. Lewis. Counterfactuals. Harvard University Press, Cambridge (MA), 1973.
  • [52] S. Lindström and W. Rabinowicz. DDL unlimited: dynamic doxastic logic for introspective agents. Erkenntnis, 50:353–385, 1999.
  • [53] F. Liu. Changing for the Better: Preference Dynamics and Agent Diversity. PhD thesis, University of Amsterdam, 2008. ILLC Dissertation Series DS-2008-02.
  • [54] C. Lutz. Complexity and succinctness of public announcement logic. In Proceedings AAMAS 06, Hakodate, Japan, 2006.
  • [55] J.-J. Ch. Meyer and W. van der Hoek. Epistemic Logic for AI and Computer Science. Cambridge University Press, Cambridge, 1995.
  • [56] T.A. Meyer, W.A. Labuschagne, and J. Heidema. Refined epistemic entrenchment. Journal of Logic, Language, and Information, 9:237–259, 2000.
  • [57] G.E. Moore. A reply to my critics. In P.A. Schilpp, editor, The Philosophy of G.E. Moore, pages 535–677. Northwestern University, Evanston IL, 1942. The Library of Living Philosophers (volume 4).
  • [58] L. S. Moss. From hypersets to Kripke models in logics of announcements. In J. Gerbrandy, M. Marx, M. de Rijke, and Y. Venema, editors, JFAK. Essays Dedicated to Johan van Benthem on the Occasion of his 50th Birthday, Amsterdam, 1999. Amsterdam University Press.
  • [59] M. Peliš and O. Majer. Logic of questions and public announcements. In N. Bezhanishvili, S. Löbner, K. Schwabe, and L. Spada, editors, Logic, Language, and Computation, pages 145–157. Springer, 2011. LNCS 6618.
  • [60] J. Peregrin, editor. Meaning: the dynamic turn. Elsevier, Amsterdam, 2003.
  • [61] J. Plaza. Logics of public communications. Synthese, 158(2):165–179, 2007. This paper was originally published as Plaza, J. A. (1989). Logics of public communications. In M. L. Emrich, M. S. Pfeifer, M. Hadzikadic, and Z.W. Ras (Eds.), Proceedings of ISMIS: Poster session program (pp. 201–216). Publisher: Oak Ridge National Laboratory, ORNL/DSRD-24.
  • [62] G. R. Renardel de Lavalette. Changing modalities. J. Logic and Comput., 14(2):253–278, 2004.
  • [63] B. Renne. A survey of dynamic epistemic logic. manuscript, 2008.
  • [64] J. Sack. Adding Temporal Logic to Dynamic Epistemic Logic. PhD thesis, Indiana University, Bloomington, USA, 2007.
  • [65] J. Salerno, editor. New Essays on the Knowability Paradox. Oxford University Press, Oxford, UK, 2009.
  • [66] J. R. Searle. Speech Acts: An Essay in the Philosophy of Language. Cambridge University Press, Cambridge, 1969.
  • [67] K. Segerberg. Irrevocable belief revision in dynamic doxastic logic. Notre Dame Journal of Formal Logic, 39(3):287–306, 1998.
  • [68] K. Segerberg. Two traditions in the logic of belief: bringing them together. In H. J. Ohlbach and U. Reyle, editors, Logic, Language, and Reasoning, pages 135–147, Dordrecht, 1999. Kluwer.
  • [69] T. French, W. van der Hoek, P. Iliev, and B. Kooi. On the succinctness of some modal logics. Artificial Intelligence, 197:56–85, 2013.
  • [70] B. D. ten Cate. Internalizing epistemic actions. In M. Martinez, editor, Proceedings of the NASSLLI 2002 student session, pages 109 – 123, Stanford University, 2002.
  • [71] J. van Benthem. Semantic parallels in natural language and computation. In Logic Colloquium ’87, Amsterdam, 1989. North-Holland.
  • [72] J. van Benthem. Exploring Logical Dynamics. CSLI Publications, Stanford, 1996.
  • [73] J. van Benthem. Games in dynamic-epistemic logic. Bulletin of Economic Research, 53(4):219–248, 2001.
  • [74] J. van Benthem. Logics for information update. In J. van Benthem, editor, Proceedings of TARK 2001, pages 51–67, San Francisco, 2001. Morgan Kaufmann.
  • [75] J. van Benthem. What one may come to know. Analysis, 64(2):95–105, 2004.
  • [76] J. van Benthem. ‘One is a lonely number’: on the logic of communication. In Z. Chatzidakis, P. Koepke, and W. Pohlers, editors, Logic Colloquium ’02. ASL, Poughkeepsie, 2006.
  • [77] J. van Benthem. Dynamic logic of belief revision. Journal of Applied Non-Classical Logics, 17(2):129–155, 2007.
  • [78] J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, 2011.
  • [79] J. van Benthem. Logic in Games. MIT Press, 2013. To appear.
  • [80] J. van Benthem, J.D. Gerbrandy, T. Hoshi, and E. Pacuit. Merging frameworks for interaction. Journal of Philosophical Logic, 38:491–526, 2009.
  • [81] J. van Benthem, J. van Eijck, and B. Kooi. Logics of communication and change. Information and Computation, 204(11):1620–1662, 2006.
  • [82] W. van der Hoek and R. Verbrugge. Epistemic logic: a survey. In L.A. Petrosjan and V.V. Mazalov, editors, Game theory and Applications, volume 8, pages 53–94, 2002.
  • [83] H. van Ditmarsch, A. Herzig, and T. De Lima. From situation calculus to dynamic epistemic logic. Journal of Logic and Computation, 21(2):179–204, 2011.
  • [84] H. van Ditmarsch, W. van der Hoek, and P. Iliev. Everything is knowable – how to get to know whether a proposition is true. Theoria, 78(2):93–114, 2012.
  • [85] H. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic epistemic logic with assignment. In Proc. of 4th AAMAS, pages 141–148. ACM, 2005.
  • [86] H. P. van Ditmarsch. Knowledge games. PhD thesis, University of Groningen, 2000. ILLC Dissertation Series DS-2000-06.
  • [87] H. P. van Ditmarsch. Descriptions of game actions. J. Logic, Lang., Inform., 11:349–365, 2002.
  • [88] H. P. van Ditmarsch. Prolegomena to dynamic logic for belief revision. Synthese, 147:229–275, 2005.
  • [89] H. P. van Ditmarsch and B. Kooi. The secret of my success. Synthese, 151(2):201–232, 2006.
  • [90] H. P. van Ditmarsch, W. van der Hoek, and B. Kooi. Concurrent dynamic epistemic logic. In V. F. Hendricks, K. F. Jørgensen, and S. A. Pedersen, editors, Knowledge Contributors, pages 45–82. Kluwer, Dordrecht, 2003.
  • [91] H. P. van Ditmarsch, W. van der Hoek, and B. Kooi. Dynamic Epistemic Logic. Springer, Berlin, 2007.
  • [92] P. Vanderschraaf and G. Sillari. Common knowledge. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Fall 2007.
  • [93] F. Veltman. Defaults in update semantics. Journal of Philosophical Logic, 25:221–261, 1996.
  • [94] Y. Wang, L. Kuppusamy, and J. van Eijck. Verifying epistemic protocols under common knowledge. In Proc. of 12th TARK, pages 257–266. ACM, 2009.

 

Author Information

Hans van Ditmarsch
Email: hans.van-ditmarsch@loria.fr
University of Lorraine
France

and

Wiebe van der Hoek
Email: wiebe@csc.liv.ac.uk
The University of Liverpool
United Kingdom

and

Barteld Kooi
Email: B.P.Kooi@rug.nl
University of Groningen
Netherlands

Classification

One of the main topics of scientific research is classification, the operation of distributing objects into classes or groups which are, in general, fewer in number than the objects themselves. It has a long history that developed over four periods: (1) Antiquity, where its lineaments may be found in the writings of Plato and Aristotle; (2) the Classical Age, with natural scientists from Linnaeus to Lavoisier; (3) the 19th century, with the growth of chemistry and information science; and (4) the 20th century, with the arrival of mathematical models and computer science. Since that time, and from an extensional viewpoint, mathematics, specifically the theory of orders and the theory of graphs or hypergraphs, has facilitated the precise study of strong and weak forms of order in the world, and the computation of all the possible partitions, chains of partitions, covers, hypergraphs or systems of classes that we can construct on a domain. With the development of computer science, artificial intelligence, and new kinds of languages such as object-oriented languages, an intensional approach has complemented the previous one. Ancient discussions between Aristotle and Plato, Ramus and Pascal, or Jevons and Joseph have found a kind of revival via object-oriented modeling and programming, most object-oriented languages being concerned with hierarchies, or partial orders: these structures reflect in fact the relations between classes in those languages, which generally admit single or multiple inheritance. In spite of these advances, most classifications are still based on the evaluation of resemblances between the objects that constitute the empirical data. This resemblance is almost always computed by means of some notion of distance and some algorithms of aggregation of classes. So all these classifications remain, for technical and epistemological reasons that are detailed below, very unstable. A real algebra of classifications, which could explain their properties and the relations existing between them, is still lacking. Though the aim of a general theory of classifications is perhaps wishful thinking, a recent conjecture gives hope that a metaclassification (or classification of all classification schemes) is possible.

Table of Contents

  1. General Introduction: Classification Problems
  2. A Brief History of Classifications
    1. From Antiquity to the Renaissance
    2. From Classical Age to Victorian Taxonomy
    3. The Beginning of Modernity
  3. The Problem of Information Storage and Retrieval
  4. Ranganathan and the PMEST Scheme
  5. Order and Mathematical Models
    1. Extensional Structures
    2. A Glance at an Intensional Approach
  6. The Idea of a General Theory of Classifications
  7. References and Further Readings

1. General Introduction: Classification Problems

Classification problems are one of the basic topics of scientific research. For example, mathematics, physics, natural sciences, social sciences and, of course, library and information sciences all make use of taxonomies.  Classification is a very useful tool for ordering and organization. It has increased knowledge and helped to facilitate information retrieval.

Roughly speaking, ‘classification’ is the operation consisting of sharing, distributing or allocating objects into classes or groups which are, in general, fewer in number than the objects themselves. Commonly, classifications are defined on finite sets. However, if the objects are, for example, mathematical structures, there can be infinite classifications. In this case, the previous requirement must of course be weakened: we may only want the (infinite) cardinal of the classification to be less than or equal to the (infinite) cardinal of the set of objects to be classified. What we call a ‘classification’ is also the result of this operation. We want, as much as possible, this result to be constant, namely, that the classification itself remains stable under small transformations of the data (of course, the sense of this requirement will have to become clearer). Various situations may occur: the classes may or may not intersect, be finite or infinite, formal or fuzzy, hierarchically ordered or not, and so on.

The basic operation of grouping elements into classes, which simplifies the world, is a very powerful operation, but it also raises many questions. In particular, a number of philosophers, from Socrates to Diderot and even post-modern philosophers, have criticized it (see, for instance, Foucault 1967). Indeed, this operation has multiple benefits. First is the substitution of a rational and regular order for chaotic and muddled multiplicities. Second is the reduction of the size of sets: once we have constituted equivalence classes, we can work with these classes rather than with the elements themselves. Third, and finally, to make a partition of a set means locating in it a symmetry that decreases the complexity of the problem and so simplifies the world. We can say with Dagognet (1984, 1990) that “less is more”: compressing the data really brings an intellectual gain.

Having outlined the main reasons for classifications, let us see how these classifications have developed and which forms they have taken over the course of time.

2. A Brief History of Classifications

The history of classifications (Dahlberg 1976) develops over four periods. From Plato and Aristotle to the 18th century, ancient classifications are hierarchical; they are finite and generally based on one single criterion. During the 18th century, some new classifications appear which are multicriteria (a domain can be co-divided in many ways, as Kant said in his Logic; see Kant 1988) and indefinite or virtually infinite (Kant believed that we could endlessly subdivide the extension of a concept). At the end of the 18th and the beginning of the 19th century, with the chemical classifications of Lavoisier and then of Mendeleyev, one discovers combinatorial classifications or multiple crossed orders, like the table of chemical elements, which correspond to a new concept of classification. In the 20th century, through the progress of mathematical order theory, factorial analysis of correspondences, and automatic classification, formal models begin to develop.

a. From Antiquity to the Renaissance

The French commentator of Greek philosophers R. Joly said that a typical trend of the Greek spirit was to reduce a multiple and complex reality to a few categories which satisfy reason, both by their restricted number and by the clear and precise sense attached to each of them. Indeed, Plato and Aristotle are among the great classifiers of these ancient times.

In all of Plato’s Dialogues, and especially in the later ones (Parmenides, Sophist, Politicus, Philebus), Plato obviously classified a lot of things (ways of life, political constitutions, pleasures, arts, jobs, kinds of knowledge, and so forth). Generally, for Plato, things were classified in relation to the distance that separates them from their archetypal forms, which yields some order (or pre-order) on them. Plato’s classifications are finite, hierarchical, dichotomous, and based on a single criterion. For example, in Gorgias (465c), the set of all practices is divided into two classes, the practices concerning the body and the practices concerning the soul, each of them being then divided into two others: gymnastics and medicine, on one hand, and legislation and justice, on the other. In the same way, in Republic (510a), the whole universe, viewed as the set of all real things, is divided into the visible world and the invisible world, each class being subdivided into images and objects or living beings, on the one hand, and mathematical objects and ideas, on the other.

According to Plato, the rules of classifications are very simple. First, we have to make symmetric divisions in order to get well-balanced classes. For example, if we classify the peoples, we have to avoid setting the Greeks in front of all the other peoples, because one of the classes will be plethoric while the other one will have only one element (Politicus, 262a). Second, like a good cook who cuts up an animal (this metaphor is in the Phaedrus), it is also necessary to choose the good joints or articulations. For example, in the field of numbers, it would be senseless to set 1000 in front of 999 other numbers. In contrast, the opposition even/odd, or prime/not prime, is a real one. Third, in general, we must also avoid using negative determinations. For example, we have to avoid determinations like not-A, because it is impossible for non-being to have sorts or species; such determinations block the development of thought.

Plato did not observe these wise rules, so incurring Aristotle’s criticisms. Against Plato’s theory, Aristotle argues that the method of division is not a powerful tool because it is non-conclusive: it does not make syllogisms (First Analytics, I, 31). In another text (Second Analytics, II, 5), Aristotle insists on the contingency of the passage from a predicate to another one; that is, in the Platonic division, for every new attribute, we can wonder why it is this attribute as opposed to another one. The differences introduced by dichotomies can also be purely negative and thus do not necessarily define a real being. Moreover, binary divisions presuppose that the number of the primitive species is a power of 2. In a division, a predicate can belong to different primitive species; for example, “bipedalism” can apply to both birds and humans. But, according to Aristotle, the application of this term is not the same in both cases. Finally, the Platonic division confuses extensional and intensional views. It can identify the triangle, which is a kind, with one of its properties, for example, the equality of the sum of its angles to two right angles.

The previous questions get no answer in Plato’s theory. Aristotle rejected Plato’s method of division. But Aristotle also rejected the Platonic doctrine of forms. According to Aristotle (Metaphysics, I, 9), Plato’s forms fail to explain how there could be permanence and order in the world. Furthermore, he argued, Plato’s theory of forms cannot explain anything at all in our material world. The properties that the forms have (according to Plato the forms are eternal, unchanging, transcendent, and so forth) are not compatible with material objects, and the metaphor of participation or imitation breaks down in a number of cases. For instance, it is unclear what it means for a white object to participate in, or to copy, the form of whiteness; that is, it is hard to understand the relationship between the form of whiteness and white objects themselves.

For all these reasons, Aristotle develops his own concepts, and his own logic of classifications. In the Topics (I, chap. 1), Aristotle introduces the notions of kind, species, and property, and a whole theory of basic predication that was subsequently developed in the works of Porphyry and Boethius. This theory is based on the opposition between essence, all of the characters that define a thing, and accident, the qualities whose presence or absence does not modify the thing’s essence. A commentator of the Aristotelian system, Porphyry (234-305) puts these distinctions to good use and tries to specify the hierarchy of the kinds and the species as defined by Aristotle. The famous Porphyrian Tree is the first abstract tree outlining these distinctions and illustrates the subordination existing between them (see Figure 1).


Figure 1: The Porphyrian Tree

In a passage of his Commentary on Aristotle’s Categories (2014), Porphyry asked questions that lay at the origin of a hotly debated controversy over whether universals are physical or immaterial substances, that is, a contention over whether universals are separate from sensible things or are involved in them, finding their consistency therein. In opposition to the traditional views (Platonic and Aristotelian or scholastic realisms), other solutions appeared. For example, Nominalism (Roscelin, 11th c.) claimed that universals are but words and that nothing corresponds to them in Nature, which knows only the singular. Against that was Conceptualism (Abélard, 12th c., and Ockham, 14th c.), the view that kinds exist as predicates of subjects that are themselves real. In the last centuries of the Middle Ages and in the Renaissance, we also find great scholars who worked on classification, in particular Francis Bacon (1561-1626), whose work on the classification of knowledge inspired the great librarians of the 19th century. But the logic of classifications, which remained at that time the Aristotelian logic, received practically no new development until the 18th century.

b. From Classical Age to Victorian Taxonomy

In the Classical Age, taxonomy as a fully-fledged discipline began to develop for several reasons. One important reason emerges from the birth of natural science and the need to organize floras and faunas in connection with the growth of the human population on Earth, in the context of the beginning of agronomy (Dagognet, 1970). In this period, naturalists like Tournefort (1656-1708), Linnaeus (1707-1778), De Jussieu (1748-1836), Desfontaines (1750-1833) and Cuvier (1769-1832) tried to classify plants and animals all around the world.

When classifying things or beings, one needs a criterion or an index in order to make classes and to separate varieties inside the classes. Indeed, all those naturalists differ on the criteria of their classifications. For example, concerning the classification of plants, Tournefort chose the corolla, while Linnaeus chose the sexual organs of the plant. Concerning animals, the classification of Cuvier violates Aristotle’s recommendations by opposing vertebrates and invertebrates, which, by chance, are something real. At the end of the century, Kant summarizes, in his Logic (1800), the main part of the knowledge about classifications in this period, by specifying the definitions of a certain number of terms and operations that the naturalists of the time used empirically. Kant was only interested in the forms of classifications. In his Logic he defines a logical division of a concept as “the division of all the possible contained in it”. The rules of this division are the following: 1) the members of the division are mutually exclusive, 2) their union restores the sphere of the divided concept, 3) each member of the division can itself be divided (the division of such divided members is a subdivision). Rules (1) and (2) seem to indicate that Kant was approaching our concept of a partition. But (3) shows that he did not have the concept of a chain of partitions, since he does not see that the subdivisions of the same level together form one and the same partition.

These problems were also discussed during the 19th century in Anglo-Saxon countries, even after Darwin’s theory of evolution. One may think that Darwin’s belief in branching evolution was based upon his familiarity with the taxonomy of his day, of which he was very well aware. There were great taxonomists in England in the Victorian age, and some of them (for instance, the paleontologist H. Alleyne Nicholson, a specialist in British Stromatoporoids) were prodigious and wrote monographs still in force today (Woodward 1903). At approximately the same time, H. Agassiz (Agassiz 1957), a scholar in classification theory, wrote about taxonomic concepts like categories, divisions, forms, homologies, analogies, and so on. Among the different taxonomic systems mentioned in his Essay on Classification are the classical systems of Leuckart, Vogt, Linnaeus, Cuvier, Lamarck, de Blainville, Burmeister, Owen, Ehrenberg, Milne-Edwards, von Siebold, Stannius, Oken, Fitzinger, MacLeay, von Baer, van Beneden, and van der Hoeven. In The Origin of Species, Darwin himself said that it was a

truly wonderful fact…that all animals and all plants throughout all time and space should be related to each other in group subordinate to group, in the manner which we everywhere behold−namely, varieties of the same species most closely related together, species of the same genus less closely and unequally related together, forming sections and sub-genera, species of distinct genera much less closely related, and genera related in different degrees, forming sub-families, families, orders, subclasses, and classes. (1859, 128)

But what he called the “principle of divergence”, namely, the fact that during the modification of the descendants of any one species and during the incessant struggle of all species to increase in numbers, the more diversified these descendants become, the better will be their chance of succeeding in the battle of life, was illustrated by his famous tree-like diagram sketched in 1837 in the notebook in which he first posited evolution. From this time, tree-like structures, which were also of great use in chemistry and would be formalized at the end of the century by the mathematician Arthur Cayley, tended to replace classifications.

c. The Beginning of Modernity

A new kind of classification appeared at the end of the 18th century, with the development of chemistry, namely, combinatorial classifications or multiple crossed orders. This kind of classification is either the crossing of two or more divisions, or the crossing of two or more hierarchies of divisions. In such a structure, as Granger (1967) said, “elements are distributed according to two or several dimensions, giving rise to a multiplication table”. In a combinatorial classification, the elements themselves are not necessarily distributed into classes; only the components of these elements are classified. For Granger, this model refers to the Cartesian plane and to the ordinal principle on which it is based. The Cartesian plane results from the will to order a certain distribution of points in space, by ordering the points in every row and then by ordering the rows themselves. The virtue of multiple orders is to place what is classified at the intersection of a line and a column. So, as Dagognet (1969) has shown, when an element is absent or a compartment is empty, it can be defined by its surroundings. This is what happened in the Mendeleyev table. This table has two main advantages. First, the table is creative: the mass of a chemical element can be calculated from those which surround it (see Figure 2), and hence chemical elements which did not exist in Nature but were synthesized in laboratories only 30 years later had already been accounted for by Mendeleyev. Second, the classification is not a purely spatial picture of the world: temporality, in particular the future, is already present in it.


Figure 2: The mass of an unknown element in the Mendeleyev Table

3. The Problem of Information Storage and Retrieval

At the end of the 19th century, the development of scientific research, which raised the question of information storage and retrieval, encouraged the constitution of voluminous library catalogues. These included Dewey’s decimal classification, Otlet and La Fontaine’s universal decimal classification, and the Library of Congress classification. The aim of these kinds of classifications was to account for the whole of knowledge in the world. But many problems arose from this attempt of library science to organize the whole of knowledge. Three rules were commonly respected in more natural classifications: 1) everything classified must appear in the catalogue (which must be, in principle, finite and complete), 2) there is no empty class, 3) nothing can belong to more than one class. Generally, these rules are not respected in library classifications.

To face the extraordinary challenge of cataloguing knowledge that grows indefinitely over the course of time, the big library classifications designed at the end of the 19th century adopted the principle of decimalization. This system was used because decimal numbers, used as numeral items, authorize indefinite extensions of classifications. Suppose you start with 10 main classes, from 0 to 9. If you add a digit to each number, you get the possibility of forming 100 classes (from 00 to 99), and if you go on, you can obtain 1000 classes (from 000 to 999). Then you can also put a comma or a point, and define items like 150.234. After the point, the sequence of numbers is potentially infinite and you can go as far as needed. Another difference is that library classifications can sometimes allow for vacant classes in their hierarchy, and can also allow the inscription of classified subjects in several places. Vacant classes are used because a librarian must keep some room for new documents that are still temporarily unclassified. Multiple inscriptions are used because readers, who sometimes do not know exactly what they are looking for, need broad-ranging access to knowledge. This made way for the existence of entries like author, subject, time, place, and so forth.

The previous requirement of decimalization is obvious in the Dewey Decimal Classification (DDC) proposed by Melvil Dewey in 1876 (Béthery 1982). This classification is made up of ten main classes or categories, each of them being divided into ten secondary classes or subcategories. These last ones contain in turn ten subdivisions. The partition of the ten main classes thus gives successively 100 divisions and 1000 sections.

DDC — main sections

  • 000 – Computer Science, Information and General Works
  • 100 – Philosophy and Psychology
  • 200 – Religion
  • 300 – Social Sciences
  • 400 – Language
  • 500 – Science (including Mathematics)
  • 600 – Technology and Applied Science
  • 700 – Arts and Recreation
  • 800 – Literature
  • 900 – History, Geography and Biography

In the same way, the Universal Decimal Classification (UDC) of Otlet and La Fontaine globally presents the same hierarchical organization, except that the fourth main class is left empty (thus applying the previous principle of vacant classes).

As librarians rapidly observed, one undesirable consequence of such decimal schemes is the increasing fragmentation of subjects as the taxonomist’s work proceeds. For example, the Dewey Classification, though having the useful advantage of being infinitely extendible, rapidly turns into a list or a nomenclature. This is also the case for the UDC of Otlet and La Fontaine, and for all classifications of the same type. A first attempt to make up for this disadvantage has consisted of allowing some junctions between categories in the classification. A second one is the possibility of using some tables (7 in the DDC) to aid in the search for a complex object, which may be located in different sites. For instance, a book of poetry written by various poets from around the world would appear in several classes, indexed thanks to the tables. In general, DDC combines elements from different parts of the structure in order to construct a number representing the subject content. This often combines two or more subject elements with linking numbers and geographical and temporal elements. The method consists of forming a new item rather than drawing upon a list containing each class and its meaning. For example, 330 (for Economics) + 9 (for geographic treatment) + 04 (for Europe), joined with ‘/’, gives 330/94 (European economy). Another example is the following: 973 (for United States) + 05 (the division for periodicals), joined with the point ‘.’, gives 973.05 (periodicals concerning the United States generally).
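As a small illustration of this kind of number building, here is a toy sketch in Python. The joining rule and the function name are our own simplifications; they do not reproduce the official DDC or UDC tables.

```python
# Toy sketch of DDC/UDC-style number building: attach auxiliary digits
# (already combined by the cataloguer) to a base class number.

def build_number(base, auxiliary, separator='.'):
    """Join a base class number with auxiliary digits, e.g. a geographic
    treatment plus an area notation, using the given separator."""
    return f"{base}{separator}{auxiliary}"

print(build_number("330", "94", separator="/"))  # 330/94  (European economy)
print(build_number("973", "05"))                 # 973.05  (US periodicals)
```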

Other specific features occur in library classifications, which tend to make them very different from classical scientific taxonomies. One spectacular difference from hierarchical classifications in Zoology or Botany is, as we have already seen, that it is possible for subjects to appear in more than one class. For example, in DDC, a book on Mathematics could appear in the 372.7 section or in the 510 section, depending on whether the book is a monograph instructing teachers on how to teach mathematics, or a mathematics textbook for children. Another difference is the relative flexibility of library classifications.

Though there have been improvements, UDC and DDC, like most of the classifications constructed at the same time (see Bliss 1929), are based on a perception of knowledge and of the relationships between academic disciplines extant from 1890 to 1910. Moreover, though updated regularly, UDC and DDC, as decimal systems, are less hospitable to the addition of new subjects. These kinds of classifications are based on fixed and historically dated categories. One may observe, for example, that none of the main concepts of present-day library science (digital library, knowledge organization, automatic indexing, information retrieval, and so forth) were included in the index of the 2005 UDC edition, and that technical taxonomies generally require more complex features (Dobrowolski 1964).

4. Ranganathan and the PMEST Scheme

There have been many attempts to solve the aforementioned librarian problems, some of them well known since the middle of the 20th century. In the course of the 20th century, new modes of indexing and original classification schedules appeared in library science with the Indian librarian Shiyali Ramamrita Ranganathan (1933, 2006) and his faceted classification, also called “Colon Classification” (CC) because of its use of the colon to indicate relations between subjects in its earlier editions.

Ranganathan was at first a mathematician and knew little about libraries. But he took charge of the Madras University Library, and was then deputed by his university to study library science in London. There, he attended the School of Librarianship at University College and discovered, as he said later, the “charm of classifications”, and also their problems. He saw very quickly that decimal classifications did not give satisfaction to users. Instead, he had the vision of a Meccano set where, rather than having ready-made rigid toys, one can construct them from a few fundamental components. This made him think of a new kind of classification.

It appeared to Ranganathan that the new theory might be organized at the higher level into 5 fundamental categories (FC) called facets: Personality, Matter, Energy, Space and Time, in summary PMEST. Each isolate facet of a Compound Subject is deemed to be a manifestation of one (and only one) of the five fundamental categories. There are also subfacets, so the facet scheme PMEST, and the subfacets we may form from it, are then used to sort subclasses within the main classes of the classification.

The difference with previous classifications is in the way one defines ‘subfacets’. Rather than simply dividing the main classes into a series of subordinate classes, one subdivides each main class by particular characteristics into facets. Facets, labeled by Arabic numbers, are then combined to make subordinate classes as needed. For example, Literature may be divided by the characteristic of language into the facet of Language, including English, German, and French. It may also be divided by form, which yields the facet of Form, including poetry, drama and fiction. So CC contains both basic subjects and their facets, which contain isolates. A basic subject stands alone, for example: Literature in the subject English Literature, while an isolate, in contrast, is a term that modifies a basic subject, for example, the term ‘English’. Every isolate in every facet must be a manifestation of one of the five fundamental categories in the PMEST scheme.

The advantages of the CC are numerous. The first is greater flexibility in determining new subjects and subject numbers. A second is the concept of phases, which allows taxonomists to readily combine most of the main classes in a subject. Consider, for example, a subject like Mathematics for biologists. In this case, single-class-number enumerative systems, such as those predominating in US libraries, tend to force classifiers to choose either Mathematics or Biology as the main subject. However, CC supplies a specific notation to indicate this bi-phased condition.

Indeed, some problems remain unsolved. In CC, facets, that is, small components of larger entities or units, are similar to the flat faces of a diamond which reflect the underlying symmetry of the crystal structure, so that the general structure of Ranganathan’s classification, like that of a faceted classification in general, is a kind of permutohedron. In principle, all descriptions may be given in whatever order. For example, if we have to classify a paper about seasonal variations of the concentration of noradrenaline in the tissue of the rat, we must get the same access whether we have the direct sequence (1) seasonal, variations, concentration, noradrenaline, tissue, rat, or the reversed one (2) rat, tissue, noradrenaline, concentration, variations, seasonal. In mathematical terms, this means that the underlying structure that makes this transformation possible must be a commutative group. But this is not always the case, and for some dihedral groups this structure is even forbidden. Another potential worry is that the PMEST scheme, which certainly has some connections with Indian thought, is far from being universally accepted (see De Grolier 1962) and has not often been implemented in libraries, even in India.
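For illustration, here is a toy sketch of how a faceted description could be normalized into one canonical PMEST order, which is one way to sidestep the ordering problem just mentioned. The assignment of isolates to categories and the colon notation used here are invented for the example; they do not follow Ranganathan's actual schedules.

```python
# Toy faceted (colon-classification-style) subject string: whatever order the
# description arrives in, the PMEST scheme imposes one canonical ordering.

PMEST = ["Personality", "Matter", "Energy", "Space", "Time"]

def compound_subject(basic_subject, isolates):
    """isolates: dict mapping a PMEST category to an isolate term."""
    parts = [isolates[c] for c in PMEST if c in isolates]
    return basic_subject + ":" + ":".join(parts)

# The noradrenaline example, with hypothetical category choices:
description = {"Time": "seasonal variation",
               "Matter": "noradrenaline concentration",
               "Personality": "rat",
               "Energy": "physiology"}
print(compound_subject("Pharmacology", description))
# Pharmacology:rat:noradrenaline concentration:physiology:seasonal variation
```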

So, in spite of all the improvements they have received over time, many problems have been raised against library classifications. In particular, library classifications were strongly questioned in the 20th century by the proliferating development of knowledge. First, the ceaseless flux of new documents forbids a rigid topology for classifications; the problem, then, is to know how to construct evolutionary structures. Second, the successive reorderings of knowledge (groupings and revisions, and not only ramifications) have called for powerful relational and automated documentary languages. Classifications still remain necessary, because documentary languages cannot do everything. So the problem is still open. But, with the great development of mathematics in the last century, this general problem, which is the great problem of order, has to be investigated by means of mathematical structures.

5. Order and Mathematical Models

The first attempts to study orders in mathematics began to develop at the end of the 19th century with Peano, Dedekind and Cantor (especially with his theory of ordinals, which are linearly ordered sets). They continued with Peirce (1880) and Schröder (1890) and their work on an algebra of logic. Then, in the first part of the 20th century, came the notion of partial order, with an article of MacNeille (1937) and the famous work of G. Birkhoff (1967), who introduced the notion of lattice, developed algebraically later in the great book of Rasiowa and Sikorski (1970). During the same period, mathematical models of hierarchical classifications, which were investigated in the USA by Sokal and Sneath (1963, 1973) and, in England, by Jardine and Sibson (1971), were developed in France in the works of Barbut and Monjardet (1970), Lerman (1970, 1981), and Benzécri (1973). All these works presupposed the major advances of the last century in mathematical order theory: especially the papers of Birkhoff (1935), Dubreil-Jacotin (1939), Ore (1942, 1943), Krasner (1953-1954) and Riordan (1958). The Belgian logician Leo Apostel (1963) and the Polish mathematicians Luszczewska-Romahnowa and Batog (1965a, 1965b) have also published important articles on the subject. The increasing use of computers in the search for automatic classifications has also been, in those years, a reason for researchers to get interested in mathematical models.

As there are many forms of classifications in the world of knowledge (we can find them, as we have seen, in mathematics, natural sciences, library and information science, and so forth) there are also many possible mathematical models for classifications. We begin with the study of extensional structures.

a. Extensional Structures

In order to clarify the situation, we start with the weakest of these structures and move to stronger ones. Mathematics allows us to begin with very few axioms, which usually define weak general structures; afterwards, by adding new conditions, one gets further properties and stronger models. In our case, the weakest structure is just a hypergraph H = (X, P) in the sense of Berge (1970), with X a set of vertices and P a set of nonempty subsets of X called edges (see Figure 3).

Figure 3: A Hypergraph

In this case, the set of edges P does not necessarily cover the set X, and some nodes (vertices of degree zero) may belong to no edge. Assume now the following conditions:

(C0)    X ∈ P,

(C1)    For all x ∈ X, {x} ∈ P.

Accordingly, we have a system of classes (in the sense of Brucker-Barthélemy 2007).

If, instead, the family P satisfies the following conditions:

(C2)      Pi ∩ Pj = Ø for all distinct Pi, Pj ∈ P,

(C3)      ∪ Pi = X,

Then P is a partition of X and the Pi are the blocks of the partition P.
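As a minimal sketch (in Python; the function names are ours, not Brucker-Barthélemy's), conditions (C0)–(C3) can be checked mechanically on a finite family of subsets.

    # Illustrative sketch (names invented): checks of (C0)-(C3) on a finite family P
    # of nonempty subsets of X, following the definitions above.

    def is_system_of_classes(X, P):
        """(C0): X belongs to P; (C1): every singleton {x}, x in X, belongs to P."""
        return frozenset(X) in P and all(frozenset({x}) in P for x in X)

    def is_partition(X, P):
        """(C2): blocks pairwise disjoint; (C3): nonempty blocks covering X."""
        blocks = list(P)
        pairwise_disjoint = all(
            blocks[i].isdisjoint(blocks[j])
            for i in range(len(blocks)) for j in range(i + 1, len(blocks))
        )
        covers_X = set().union(*blocks) == set(X) if blocks else not X
        nonempty = all(b for b in blocks)
        return pairwise_disjoint and covers_X and nonempty

    X = {1, 2, 3, 4}
    P1 = {frozenset(X), frozenset({1}), frozenset({2}), frozenset({3}), frozenset({4})}
    P2 = {frozenset({1, 2}), frozenset({3}), frozenset({4})}

    print(is_system_of_classes(X, P1))  # True
    print(is_partition(X, P2))          # True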

Let now P(X) be the set of partitions of a nonempty finite set X. We may define on P(X) a partial order relation ≤ (reflexive, antisymmetric, and transitive) such that (P(X), ≤) is a lattice in the sense of Birkhoff (1967), that is, a partial order in which every pair of elements has a least upper bound and a greatest lower bound. One can then prove that the chains of this lattice (its linearly ordered sequences of partitions) are equivalent to hierarchical classifications. So the set C(X) of all these chains is exactly the set of all hierarchical classifications on X. This set C(X) has itself a mathematical structure: it is a semilattice for set intersection. This model allows us to get all the possible partitions of X and all the possible chains in C(X) (see Figure 4).

Figure 4: The lattice of partitions of a 4-element set.

A first problem is that such partitions are very numerous. For |X| = 9, for example, there are already 21,147 partitions (the ninth Bell number). So, when we want to classify some domain of objects (plants, animals, books, and so forth), it is not easy to determine which classification is the best among, say, several thousands of them.
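The counting can be checked directly with a small script (plain Python; the function names are ours). It enumerates the partitions of a finite set, so that one recovers the Bell numbers (15 partitions for a 4-element set, 21,147 for a 9-element set), and it verifies that a chain of partitions ordered by refinement behaves as a hierarchical classification.

    # Illustrative sketch (names invented): enumerating set partitions and checking
    # the refinement order on a chain of partitions.

    def partitions(elements):
        """Enumerate all set partitions of a list (each partition is a list of blocks)."""
        if not elements:
            yield []
            return
        first, rest = elements[0], elements[1:]
        for smaller in partitions(rest):
            # put `first` in an existing block...
            for i, block in enumerate(smaller):
                yield smaller[:i] + [block + [first]] + smaller[i + 1:]
            # ...or in a block of its own
            yield [[first]] + smaller

    def refines(P, Q):
        """P <= Q in the refinement order: every block of P is inside some block of Q."""
        return all(any(set(b).issubset(set(c)) for c in Q) for b in P)

    X = list(range(1, 5))
    print(len(list(partitions(X))))   # 15 partitions of a 4-element set (Bell number B4)

    # A chain of partitions, from the finest to the coarsest, is a hierarchical classification:
    chain = [[[1], [2], [3], [4]], [[1, 2], [3], [4]], [[1, 2], [3, 4]], [[1, 2, 3, 4]]]
    print(all(refines(chain[i], chain[i + 1]) for i in range(len(chain) - 1)))  # True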

A second problem is that the world is not made of chains of partitions. If it were, of course, the game would be over: everything could be inserted in some hierarchical classification. But the real world has no reason to present itself as a hierarchical classification. In the real world, we generally have to deal with quite chaotic entities, complicated fuzzy classes, and poorly structured objects, all of which form what we may call 'rough data'. So when we want to get a clear order, we have to construct it by extracting it from these complicated data. For that, we have to compare objects and to know the degree to which they are similar, and to do so we need, of course, a notion of 'similarity'. In order to make empirical classifications, we must evaluate the similarities or dissimilarities between the elements to be classified. In the history of taxonomic science, Buffon (1749) and Adanson (1757) tried to understand the meaning of this evaluation in the following way. First, they claim, we have to measure the distance between the objects by means of some index, so that we can build classes. Afterwards, we have to measure the distance between the classes themselves, so that we can group some classes into classes of classes, and so replace the initial set of objects with an ordered set of classes that is less numerous than the objects.

What old taxonomists were doing on the basis of observation alone can now be carried out with the help of mathematics, using a modern notion of distance. Lerman (1970) and Benzécri (1973) showed that a hierarchical classification, that is, a chain of partitions, is nothing but a particular kind of distance or, more generally, a particular kind of dissimilarity (Van Cutsem 1994). It is an ultrametric distance, which gives tree representations (Barthélemy and Guénoche 1988) and also has the special property of corresponding exactly to the chain, so that, when considering all the chains, the set of their corresponding distance matrices forms a semiring (R, +, ×) once we interpret the lattice operations min and max in an unusual but clever manner (+ for min, × for max) (Gondran 1976). Problems arise when the distance between the objects to be classified is not ultrametric. In such cases, we have to choose the closest ultrametric smaller than the given distance, and so obtain the best hierarchical classification available, the one closest to the data. However, this kind of approach leads, in general, to relatively unstable classifications.
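The passage from an arbitrary dissimilarity to the closest ultrametric below it is usually made with the so-called subdominant (single-linkage) ultrametric: the new distance between two objects is the smallest, over all paths joining them, of the largest dissimilarity met along the path. The following sketch (in Python; the matrix is invented and the function name is ours) computes it with a closure in the (min, max) algebra just mentioned.

    # Illustrative sketch (invented data): the subdominant ultrametric u(i, j) is the
    # minimum, over all paths from i to j, of the maximum dissimilarity along the path.
    # It is the largest ultrametric lying below d and corresponds to single linkage.

    def subdominant_ultrametric(d):
        """d: symmetric dissimilarity matrix (list of lists). Returns the subdominant ultrametric."""
        n = len(d)
        u = [row[:] for row in d]
        # Floyd-Warshall-like closure in the (min, max) algebra.
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
        return u

    d = [
        [0, 2, 6, 10],
        [2, 0, 5, 9],
        [6, 5, 0, 4],
        [10, 9, 4, 0],
    ]
    u = subdominant_ultrametric(d)

    # Ultrametric inequality: u(i, j) <= max(u(i, k), u(k, j)) for all i, j, k.
    n = len(u)
    print(all(u[i][j] <= max(u[i][k], u[k][j])
              for i in range(n) for j in range(n) for k in range(n)))  # True

The hierarchy read off from u is the single-linkage one, and, as the last sentence above notes, it can change appreciably under small perturbations of d.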

Indeed, there are two kinds of instability for classifications. The first, intrinsic instability, is associated with the plurality of methods (distances, algorithms, and so forth) that can be used to classify objects. The second, extrinsic instability, is connected to the fact that our knowledge changes with time, so that the definitions of the objects (or of their attributes) evolve.

An answer to the question of intrinsic instability is a theorem of Lerman (1970), which says that if the number of attributes (or properties) possessed by the objects of a set X is constant, the quasi-order associated with any natural metric is the same. But this result has two limits. First, when the sample variance of the number of attributes is large, stability is of course lost; second, the result holds when we classify the objects but not, conversely, when we classify the attributes.

For extrinsic instability, answers are more difficult to find. We may appeal to the methods used in decimal library classifications (UDC, Dewey, and so forth), which make infinite ramified extensions possible; but these classifications, as we have seen, tend to assume that the higher levels are invariant, and they also have the disadvantage of being enumerative and of degenerating rapidly into simple lists. Another option is pseudo-complemented structures (Hilman 1964), which admit some kind of waiting boxes (or compartments) for indexing things that are not yet classified. There are also structures whose transformations obey rules fixed in advance, as in the case of Hopcroft 3-2 trees (Aho, Hopcroft, Ullman 1983), or of structures close to these (Larson and Walden 1979).

In recent years, new models for making classifications have come from formal concept analysis and the logic of distributed systems (Barwise and Seligman 2003), from computer science, and from views using non-classical logics in the domain of formal ontologies (Smith 1997, 2003). In computer science, for example, the concept of Abstract Data Type (ADT), related to the concept of data abstraction and important in object-oriented programming, may be viewed as a generalization of mathematical structures. An ADT is a mathematical model for data types, where a data type is defined by its behavior from the point of view of a user of the data. More formally, an ADT may be defined as a “class of objects whose logical behavior is defined by a set of values and a set of operations” (Dale-Walker 1996), which is strictly analogous to algebraic structures in mathematics. So, if we are not satisfied with a rough classification of data types into collections, streams and iterators (which support loops accessing data items), and relational data structures (which capture relationships between data items), we must admit that ADTs can also be regarded as a generalized approach to a number of algebraic structures, such as lattices, groups, and rings (Lidi 2004). Hence, classifications of ADTs turn into classifications of algebraic specifications of ADTs (Veglioni 1996). In this context, computer science adds nothing to mathematics, and the problem is that a classification of mathematical structures using, for instance, category theory, as Pierce (1970) tried to do, does not bring a sufficient answer, because a category may exist while its objects are not necessarily constructible (Parrochia-Neuville 2013).
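As an illustration of the quoted definition, here is a minimal sketch (in Python; the stack example and its equations are a textbook commonplace, not taken from Dale-Walker) of an ADT given by a set of values and a set of operations satisfying equations, on the model of an algebraic structure.

    # Illustrative sketch (standard textbook example, not from the cited works):
    # a stack ADT given by values (tuples) plus operations, with the usual algebraic
    # axioms checked on sample values -- carrier + operations + equations, as in algebra.

    def empty():
        return ()

    def push(stack, item):
        return stack + (item,)

    def pop(stack):
        if not stack:
            raise ValueError("pop on empty stack")
        return stack[:-1]

    def top(stack):
        if not stack:
            raise ValueError("top of empty stack")
        return stack[-1]

    s = push(push(empty(), "a"), "b")
    assert pop(push(s, "c")) == s       # axiom: pop(push(s, x)) = s
    assert top(push(s, "c")) == "c"     # axiom: top(push(s, x)) = x
    print("Stack ADT axioms hold on the sample values.")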

So none of the previous approaches is very convincing for solving the basic problem, which always remains the same: we lack a general theory of classifications, which alone would be able to study and, in the best case, solve some of the main problems of classification.

b. A Glance at an Intensional Approach

Instead of making partitions by dividing a set of entities, so that the classes obtained in this way are extensional classes, as in the previous section, we can proceed by associating a description with a set of entities. In this case, the classes are called intensional classes. Aristotle himself mixed the two points of view in his logic, but Leibniz was the first to propose a purely intensional interpretation of classes. For a long time, that view remained a minority one and never won unanimous support among philosophers and logicians (as the numerous discussions between Aristotle and Plato, Ramus and Pascal, or Jevons and Joseph demonstrate). However, the development of computer science brought this view back, since for declarative languages, and particularly object-oriented languages, pure extensional classes or sets are rather uncommon. In this approach, the intension can be given either a priori, for example by a human actor from his knowledge of the domain, or a posteriori, when it is deduced from the analysis of a set of objects. In object-oriented modeling and programming, classes are traditionally defined a priori, with their extension mostly derived at run time. This is usually done manually (the intension being represented by logical predicates or tags), but techniques for a posteriori class discovery and organization also exist. In the context of programming languages, these techniques deal with local class hierarchy modification by adding interclasses and rely on similarity-based clustering or on the Galois lattice approach (Wille 1996).
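The contrast can be made concrete with a small sketch (in Python; the universe, the predicate, and the class names are invented for illustration): an extensional class is given by enumerating its members, while an intensional class is given a priori by a description whose extension is computed afterwards, and the two extensions need not coincide.

    # Illustrative sketch (invented data): extensional class = enumeration of members;
    # intensional class = a description (predicate) whose extension is derived from a universe.

    universe = ["sparrow", "penguin", "bat", "ostrich", "eagle"]

    # Extensional class: defined by enumeration.
    extensional_birds = {"sparrow", "penguin", "ostrich", "eagle"}

    # Intensional class: defined a priori by a description of its members.
    can_fly = {"sparrow", "bat", "eagle"}
    def flying_animal(x):
        return x in can_fly

    # Extension derived from the intension over the universe.
    intensional_flyers = {x for x in universe if flying_animal(x)}

    print(intensional_flyers)                        # {'sparrow', 'bat', 'eagle'}
    print(intensional_flyers == extensional_birds)   # False: the two classes differ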

When there is an unrelated collection of sets, which is the case in artifact-based software classification, one issue is whether to compare and organize these sets simply by inclusion or to apply conceptual clustering techniques. However, most object-oriented languages are concerned with hierarchies, whose structure may be a tree, a lattice, or any partial order. The reason is that such structures reflect the variety of languages, some of them admitting multiple inheritance (C++, Eiffel), others only single inheritance (Smalltalk). Java has a special policy on this point: it admits two kinds of concepts, classes and interfaces, with single inheritance for classes and multiple inheritance for interfaces.

The viewpoint of Aristotle was the following: the division must be exhaustive, with parts mutually exclusive, and an indirect consequence of Aristotle's principles is that only the leaves of the hierarchy should have instances. Furthermore, the divisions must be based on a common concern, whose modern name is the 'discriminator' in the Unified Modeling Language (UML). But usual programming practices do not necessarily satisfy those principles. Multiple inheritance, for example, contradicts the assumption of mutually exclusive parts, and instances may in general be directly created from any (non-abstract) class. Direct subclasses of a class can be derived according to different needs with different discriminators, but there is no evidence that this approach leads to relevant classifications. Object-oriented approaches, which transgress Aristotelian principles, are almost always practical storage modes but do not satisfy the main requisites of good classifications.

There are, however, principles that yield good classifications, and they are best described from the intensional perspective. Following Apostel (1963), we first need some basic definitions.

From an intensional viewpoint, a division (or partition) is a closed formula F which contains some assertion of the type (P ⊃ (Q1 ∨ Q2 ∨ … ∨ Qn)). So a classification is a sequence of implicative-disjunctive propositions of the following form: everything which has the property P also has one of the n properties Q1 … Qn; everything which has the property Qr also has the property S; and so on (Apostel 1963, 188).

A division is essential if the individuals having the property P, and only these individuals, may also have one of the properties Qi. So we can see that there are degrees of essentiality, insofar as the number of individuals having the Q's without having P is greater or smaller. At every level, a classification may be probably or necessarily essential, exhaustive, or exclusive.

We call the intensional weight w(P) of a property P the set of disjunctions implied by this property (with necessity, factuality, or probability). Properties defining classes at the same level may have extremely variable intensional weights. The basis of a division is the constant relation R, if any, between the properties of two different classes of this division.

A basis of division is (partially or totally) exhausted at some level insofar as, at this level, we no longer find true disjunctive propositions that are implied by the properties of this level and whose terms are connected by this very relation R.

A division is said to follow another one immediately (or to be immediately subsequent) if, for all properties P of the first and all properties Q of the second that are disjunctively implied by the P's, there exists no sequence of intermediate properties R disjunctively implied by the P's and disjunctively implying the Q's.

The form of a property defining a class is the logical form of this property (conjunction of properties, disjunction of properties, negation of properties, single property).
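These definitions lend themselves to a mechanical reading. The following sketch (in Python; the toy objects, properties, and function names are ours, not Apostel's) checks whether a division P ⊃ (Q1 ∨ … ∨ Qn) holds over a finite set of described objects and whether it is essential in the sense just defined.

    # Illustrative sketch (invented data and names): objects are described by the set of
    # properties they possess. A division P => (Q1 or ... or Qn) holds if every object
    # having P has at least one Qi; it is essential if only P-objects have one of the Qi.

    objects = {
        "o1": {"P", "Q1"},
        "o2": {"P", "Q2"},
        "o3": {"R"},
        "o4": {"P", "Q1", "Q2"},
    }

    def division_holds(objects, p, qs):
        return all(any(q in props for q in qs) for props in objects.values() if p in props)

    def is_essential(objects, p, qs):
        return all(p in props for props in objects.values() if any(q in props for q in qs))

    print(division_holds(objects, "P", ["Q1", "Q2"]))  # True: every P-object has Q1 or Q2
    print(is_essential(objects, "P", ["Q1", "Q2"]))    # True: only P-objects have Q1 or Q2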

For Apostel, an optimal classification should satisfy the following requisites:

  1. Every level needs a basis for division;
  2. No new basis for division shall be introduced before the previous one is exhausted;
  3. Every division is essential;
  4. Intensional weights of classes in a given level are comparable, and relations between intensional weights of subsequent division properties in the classification must be constant;
  5. Properties used to define classes are conjunctive ones, and not negative ones;
  6. From the intensional viewpoint, divisions must be immediately subsequent.

In real domains, some or all of these requirements fail to hold. Levels are often extensionally equivalent while, intensionally, the basis of division, the intensional weight, and so forth may or may not change.

A natural classification is one in which the definition of the domain classified determines, in one and the same way, the choice of the criteria of classification. This means that the fundamental set may be divided in such a way that the division at the first level of the classification is an essential and immediately subsequent one.

Intensional and extensional classifications are intimately related. Gathering entities into sets to produce extensional classes implies tagging these entities by their membership in these classes. But intensional classes, built according to such descriptions, have an extension which may differ from the initial extensional classes. So, in fact, the two perspectives are not totally isomorphic, and, from Peirce (Hulswit 1997) to Quine (1969) and up to the present, the question of natural classes remains open and somewhat controversial.

6. The Idea of a General Theory of Classifications

The idea of a general theory of classifications is not new. Such a project was anticipated by Kant's logic at the end of the 18th century. It was followed by many attempts to classify the sciences at the beginning of the 19th century (Kedrov 1977), and it was posed by Auguste Comte in his Cours de philosophie positive (Comte 1975) as a general theory based on the study of symmetries in nature. Comte was inspired by the mathematician Gaspard Monge and his classification of surfaces in geometry. However, in the work of Comte this remains a mere wish. In the same vein, the French naturalist Augustin-Pyramus de Candolle published in 1813 an Elementary Theory of Botany, a book in which he introduced the term 'taxonomia' for the first time (de Candolle 1813). De Candolle showed that botany had to abandon artificial methods for natural ones, in order to get a method independent of the nature of the objects. Unfortunately, nothing very concrete or precise followed his remarks. Moreover, these earlier projects were only concerned with finite classifications, particularly biological ones.

A higher and more general view came to light around the 1960s with the Belgian logician Leo Apostel. Apostel (1963) wanted to write a concrete version of set theory and, in order to do that, needed axioms that would allow him to include in the theory only the classes actually existing in the world. Apostel was thus led to question some of the well-known axioms of Zermelo-Fraenkel set theory. He did not reject the whole ZF axiomatics, but he was suspicious of axioms like the pairing axiom, the axiom of separation, and the power set axiom. He also left the axiom of infinity optional and had a rather negative opinion of the axiom of choice. This project was revived by the recent book of Parrochia-Neuville (2013).

The difficulty of solving the problem of the instability of classifications has motivated the search for clear composition laws defined on the set of classifications over a set and, if possible, for a true algebra of classifications. This is very difficult, because such an algebra would have to be, in principle, commutative and non-associative. The search is all the more crucial as a recent theorem proved by Kleinberg (2002) shows that one cannot hope to find a clustering function that is at once scale-invariant, rich, and consistent. This result means that we cannot find empirically stable classifications by using traditional clustering methods.
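Roughly, and only as a paraphrase of Kleinberg (2002), the three conditions concern a clustering function f that maps any dissimilarity d on a finite set S to a partition f(d) of S:

    \[
    \begin{aligned}
    &\textbf{Scale invariance:} && f(\alpha\, d) = f(d) \ \text{for every } \alpha > 0;\\
    &\textbf{Richness:} && \text{every partition } \Gamma \text{ of } S \text{ equals } f(d) \text{ for some dissimilarity } d;\\
    &\textbf{Consistency:} && \text{if } f(d) = \Gamma,\ d'(i,j) \le d(i,j) \text{ for } i, j \text{ in the same block of } \Gamma,\\
    & && \text{and } d'(i,j) \ge d(i,j) \text{ otherwise, then } f(d') = \Gamma.
    \end{aligned}
    \]

Kleinberg's theorem states that, for |S| ≥ 2, no clustering function satisfies all three conditions at once, which is the formal content of the impossibility mentioned above.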

In the past, some attempts have been made to formalize non-commutative parenthesized products: Comtet (1970) and, in the 1980s, Neuville used Łukasiewicz's Reverse Polish Notation (RPN), also called postfix notation, whose advantage is not only to make brackets or parentheses superfluous, but also to perform calculations on trees in the required order. But a general algebra of classifications on a set is not known, even if some new models, such as Loday's dendriform algebras, which work very well for trees (see Dzhumadil'daev-Löfwall 2002), are good candidates. In any event, we are invited to look for it, for two reasons. First, the world is not completely chaotic, and our knowledge evolves according to some laws. Second, there exist quasi-invariant classifications in physics (the classification of elementary particles), chemistry (Mendeleev's table of elements), and crystallography (the 230 crystallographic space groups), among others. Most of these good classifications are based on mathematical structures (Lie groups, discrete groups, and so forth). To address questions concerning classification theory, and to clarify its different domains, one may propose the following final view (see Figure 6):

  • When our mathematical tools apply only to sense data, we get phenomenal classifications (by clustering methods): these are generally quite unstable.
  • When our mathematical tools deal with crystallographic or quantum structures, we get what we may call, using a Kantian concept, noumenal classifications (for instance, by invariance of discrete groups or Lie groups). These are generally more stable.
  • When we search for a general theory of classifications (including infinite ones), we are in the domain of pure mathematics. In this field, ordering and articulating the infinite set of classifications amounts to constructing the continuum.

Figure 6: Metaclassification

This problem is far from being solved, because there are a lot of unstable theories (Shelah 1978, 1998). However, the recent work of Parrochia-Neuville (2013) advances the conjecture that a metaclassification, that is, a classification of all mathematical schemes of classifications, does exist. The reason is that all these forms may be expressed as ellipsoids of an n-dimensional space (Jambu 1983) that must necessarily converge on a point, the index of the classification. If a real proof is found, this will yield a theorem of existence for such a structure, from which a number of important results could follow.

7. References and Further Readings

  • Adanson, M. 1757. Histoire naturelle du Sénégal. Paris: Claude-Jean-Baptiste Bauche.
  • Aho, A.V., Hopcroft, J.E., Ullman, J.D. 1983. Data Structures and Algorithms. Reading (Mass.): Addison-Wesley Publishing Company.
  • Agassiz, L. 1962. Essay on Classification (1857), reprint. Cambridge: Harvard University Press.
  • Apostel, L. 1963. Le problème formel des classifications empiriques. La Classification dans les Sciences. Gembloux: Duculot.
  • Aristotle, 1984. The Complete Works. Princeton: Princeton University Press.
  • Barbut M., Monjardet, B. 1970. Ordre et classifications, 2 vol. Paris: Hachette.
  • Barthélemy, J.-P., A. Guénoche. 1988. Les arbres et les représentations des proximités. Paris: Masson.
  • Barwise, J., Seligman, J. 2003. The logic of distributed systems. Cambridge: Cambridge University Press.
  • Béthery, A. 1982. Abrégé de la classification décimale de Dewey. Paris: Cercle de la librairie.
  • Bliss, H. E. 1929. The organization of knowledge and the system of the sciences. New York: H. Holt and Company.
  • Benzécri, J.-P., et alii. 1973. L’analyse des données, 1, La taxinomie, 2 Correspondances. Paris: Dunod.
  • Birkhoff, G. 1935. On the structure of abstract algebras. Proc. Camb. Philos. Soc. 31, 433-454.
  • Birkhoff, G. 1967. Lattice theory (1940), 3rd ed. Providence: A.M.S.
  • Brucker F., Barthélemy, J.-P. 2007. Eléments de Classification, aspects combinatoires et algorithmiques. Paris: Hermès-Lavoisier.
  • Buffon, G. L. Leclerc de, 1749. Histoire naturelle générale et particulière (vol. 1). Paris: Imprimerie royale.
  • Candolle (de), A. P. 1813. Théorie élémentaire de la Botanique ou exposition des principes de la classification naturelle et de l’art d’écrire et d’étudier les végétaux, first edition. Paris: Deterville.
  • Comte, A. 1975. Philosophie Première, Cours de Philosophie Positive (1830), Leçons 1-45. Paris: Hermann.
  • Comtet, L. 1970. Analyse combinatoire. Paris: P.U.F..
  • Dagognet, F. 2002. Tableaux et Langages de la Chimie (1967). Seyssel: Champ Vallon.
  • Dagognet, F. 1970. Le Catalogue de la Vie. Paris: P.U.F..
  • Dagognet, F. 1984. Le Nombre et le lieu. Paris: Vrin.
  • Dagognet, F. 1990. Corps réfléchis. Paris: Odile Jacob.
  • Dahlberg, I., 1976. Classification theory, yesterday and today. International Classification 3 n°2, pp. 85-90.
  • Dale, N., Walker, H. M. 1996. Abstract Data Types: Specifications, Implementations, and Applications. Lexington, Massachusetts: D.C. Heath and Company.
  • Darwin, C.R., 1964. On the Origin of Species (1859), reprint. Cambridge: Harvard University Press.
  • De Grolier, E. 1962. Etude sur les catégories générales applicables aux classifications documentaires, Unesco.
  • Dobrowolski, Z. 1964. Etude sur la construction des systèmes de classification. Paris, Gauthier-Villars.
  • Dubreil, P., Jacotin, M.-L. 1939. Théorie algébrique des relations d’équivalence. J. Math. 18, pp. 63-95.
  • Dzhumadil’daev, A., Löfwall, C. 2002. Trees, free right-symmetric algebras, free Novikov algebras and identities. Homology, Homotopy and Applications, vol. 4(2), pp. 165-190.
  • Foucault, M. 1967. Les Mots et les Choses. Paris: Gallimard.
  • Gondran, M. 1976. La structure algébrique des classifications hiérarchiques. Annales de l’Insee, pp. 22-23.
  • Granger, G.-G. 1980. Pensée formelle et Science de l’Homme (1967). Paris: Aubier-Montaigne.
  • Hilman, D.J. 1965. Mathematical classification techniques for non-static document collections, with particular reference to the problem of relevance. Classification Research, Elsinore Conference Proceedings, Munksgaard, Copenhagen, pp. 177-209.
  • Huchard, M., Godin, R., Napoli, A. 2003. Objects and Classification. ECOOP 2000 Workshop Reader, J. Malenfant, S. Moisan, A. Moreira (Eds), LNCS 1964. Berlin-Heidelberg-New York: Springer-Verlag, pp. 123-137.
  • Hulswit, M. 1997. Peirce’s Teleological Approach to Natural Classes. Transactions of the Charles S. Peirce Society, pp. 722-772.
  • Jambu, M. 1983. Classification automatique pour l’analyse des données, 2 vol.. Paris: Dunod.
  • Jardine N., Sibson, R. 1971. Numerical Taxonomy. New York: Wiley.
  • Joly, R. 1956. Le thème philosophique des genres de vie dans l’Antiquité grecque. Bruxelles: Mémoires de l’Académie royale de Belgique, classe des Lettres et des Sciences mor. et pol., tome Ll, fasc. 3.
  • Kant, E. 1988. Logic. New York: Dover Publications.
  • Kedrov, B. 1977. La Classification des Sciences (vol. 2). Moscou: Editions du Progrès.
  • Kleinberg, J. 2002. An impossibility theorem for Clustering. Advances in Neural Information Processing Systems (NIPS), 15, pp. 463-470.
  • Krasner, M. 1953-1954. Espaces ultramétriques et ultramatroïdes. Paris: Séminaire, Faculté des Sciences de Paris.
  • Larson, J.A., Walden, W.E. 1979. Comparing insertion shemes used to update 3-2 trees. Information Systems, vol.4, pp. 127-136.
  • Lerman, I.C. 1970. Les bases de la classification automatique. Paris: Gauthier-Villars.
  • Lerman, I.C. 1981. Classification et analyse ordinale des données. Paris: Dunod.
  • Lidi R., 2004. Abstract Algebra. Berlin-Heidelberg-New York: Springer-Verlag.
  • Luszczewska-Romahnowa S., Batog T. 1965a. A generalized classification theory I. Stud. Log., tom XVI, pp. 53-70.
  • Luszczewska-Romahnowa S., Batog T. 1965b. A generalized classification theory II. Stud. Log., tom XVII, pp. 7-30.
  • MacNeille, H.M. 1937. Partially ordered sets. Transactions Amer. Math. Soc., vol. 42, pp. 416-460.
  • Ore O. 1942. Theory of equivalence relations. Duke Math. J. 9, pp. 573-627.
  • Ore O. 1943. Some studies on closer relations. Duke Math. J. 10, pp. 761-785.
  • Parrochia, D., Neuville, P. 2013. Towards a general theory of classifications. Basel: Birkhäuser.
  • Peirce C. S. 1880. On the Algebra of Logic. American Journal of Mathematics 3, pp. 15-57.
  • Pierce, R.S. 1970. Classification problems. Mathematical System theory, vol. 4, n°1, March, pp. 65-80.
  • Plato, 1997. The Complete Works. Cambridge: Hackett Publishing Company.
  • Porphyry, 2014. On Aristotle’s Categories. London, New York: Bloomsbury Publishing Plc.
  • Quine, W.V.O. 1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
  • Ranganathan, S. R. 1933. Colon Classification. Madras: Madras Library Association.
  • Ranganathan, S. R. 2006. Prolegomena to Library Classification (1937), Reprint. New Delhi: Ess Pub..
  • Rasiowa H., Sikorski, R. 1970. The Mathematics of Metamathematics. Cracovia: Drukarnia Uniwersytetu Jagiellonskiego.
  • Riordan, J. 1958. Introduction to combinatorial analysis. New York: Wiley.
  • Roux, M. 1985. Algorithmes de classification. Paris: Masson.
  • Shelah, S. 1988. Classification Theory (1978). Amsterdam: North Holland.
  • Schröder, E. 1890. Vier Kombinatorische Probleme. Z. Math. Phys. 15, pp. 361-376.
  • Smith, B. 1997. Boundaries: An Essay in Mereotopology. L. Hahn (ed.), The Philosophy of Roderick Chisholm. La Salle, Open Court: Library of Living Philosophers, pp. 534-561.
  • Smith, B. 2003. Groups, sets and wholes. Rivista di estetica, NS (P. Bozzi Festschrift), 24-3, 1209-130.
  • Sokal, R. R., Sneath, P.H. 1963. Principles of Numerical Taxonomy. San Francisco: W. H. Freeman.
  • Sokal, R. R., and Sneath, P. H. 1973. Numerical Taxonomy, the principles and practice of numerical classifications. San Francisco: W. H. Freeman.
  • Van Cutsem B. (ed.) 1994. Classification and dissimilarity analysis. New York-Berlin-Heidelberg: Springer Verlag.
  • Veglioni, S. 1996. Classifications in Algebraic specifications of Abstract Data Types. CiteSeerX
  • Winsor, M. P. 2009. Taxonomy was the foundation of Darwin’s evolution. Taxon 58, 1, pp. 43-49.
  • Wille, R. 1996. Restructuring lattice theory: an approach based on hierarchy of concepts. Rival, I (ed.) Ordered Sets. Boston: Reidel, pp. 445-470.
  • Woodward, H. 1903. Memorial to Henry Alleyne Nicholson. M.D., D.Sc., F.R.S. Geological Magazine, 10, pp. 451-452.

 

Author Information

Daniel Parrochia
Email: daniel.parrochia@wanadoo.fr
Université Jean Moulin – Lyon III
France

The Aim of Belief

It is often said that belief has an aim. This aim has been traditionally identified with truth and, since the late 1990s, with knowledge. With this claim, philosophers designate a feature of belief according to which believing a proposition carries with it some sort of commitment or teleological directedness toward the truth (or knowledge) of that proposition. This feature is taken to be constitutive of belief (that is, it is part of what a belief is that it is an attitude having this aim) and individuative of that type of mental state (that is, it is sufficient for distinguishing beliefs from other types of mental attitude like desire and imagining). Philosophers appeal to belief’s aim mainly for explanatory purposes: the aim is supposed to explain a number of other features of belief, such as the impossibility of believing at will, the infelicity of asserting Moorean sentences (for example, “I believe that it is raining, but it is not raining”), and the normative force of evidential considerations in the processes of belief-formation and revision.

Though many tend to agree on the above aspects of the aim, there are major disagreements over two further issues: (1) how to interpret the claim that belief has an aim, and (2) what this aim is. With respect to (1), the claim has received very different interpretations. Some have interpreted it literally, taking the aim as an intentional purpose of believers or a functional goal of beliefs; others have interpreted it metaphorically, as some kind of commitment or norm governing beliefs and their regulation (formation, maintenance, and revision); still others deny that beliefs aim at truth in a substantive sense and endorse minimalist accounts of belief’s truth-directedness. With respect to (2), there is an ongoing debate on whether the aim of belief is truth, knowledge, or some other condition such as epistemic justification.

Table of Contents

  1. The Truth-Directedness of Belief
    1. The Aim as Constitutive and Individuative of Belief
    2. Differences between the Aim and Other Properties of Belief
    3. The Explanatory Role of the Aim
  2. Interpretations of the Aim
    1. Teleological Interpretations
    2. Normative Interpretations
    3. Minimalist Interpretations
  3. What Does Belief Aim At?
  4. Relevance of the Topic
  5. References and Further Reading

1. The Truth-Directedness of Belief

The claim that “belief aims at truth” was first formulated by Bernard Williams (1973) to designate a set of properties of beliefs, namely (1) that truth and falsehood are dimensions of assessment of beliefs as opposed to other psychological states and dispositions; (2) that to believe that p is to believe that p is true; and (3) that to say “I believe that p” carries, in general, a claim that p is true; that is, it is a qualified way of asserting that p is true (Williams, 1973, p. 137).

Since Williams, many have taken up the claim that belief aims at truth. However, with such an expression, these philosophers do not refer to a set of properties as Williams did, but to a unique feature of belief (sometimes also called truth-directedness). This feature (that is, aiming at truth) is supposed to capture the specific relation of belief with truth. This relation seems to be peculiar to belief, and to play an important role in the characterization of this type of attitude. No other attitude seems to entertain such a special relation with truth. Like belief, the content of attitudes like (propositional) desires, imaginings, and mere thoughts can be true or false. But differently from these attitudes, beliefs are considered defective if their content is false, or correct if it is true: if I imagine that snow is black, there is nothing defective in my imagination; but if I believe that snow is black, there is something wrong with my belief. Also, we can arbitrarily decide to form or revise attitudes like imagining and assuming regardless of whether we take their contents to be true or false, but this seems not to be possible for beliefs. In short, these attitudes are not sensitive to truth-regarding considerations in the way beliefs are (in both normative and descriptive ways). The relation of belief with truth also differs from that of factive attitudes like knowledge and regret. Differently from beliefs, these attitudes imply the truth of their content. If I know that it is raining now in Paris, then it is true that it is raining now in Paris. But if I merely believe it, the content of my belief may be false. The relation of belief to truth is thus neither as weak as that of other attitudes like imagining, nor as strong as that of knowledge. This is why it is often conceived as an aim or a commitment toward the truth (or knowledge) of the believed proposition: the claim is that beliefs may fail to be true (to achieve that aim), not that they may fail to aim at truth.

That granted, a further question is how to interpret the claim that beliefs aim at truth. Philosophers conceive of truth-directedness in very different ways: as an intentional aim of the believer to accept a proposition if and only if it is true; as a function regulating our cognitive processes; as a norm requiring one to believe a proposition only if true; as a value attached to believing truly. In this section I remain neutral on the specific interpretations of the aim, postponing a discussion of these interpretations to §2. The objective of the present section is to introduce some properties commonly attributed to truth-directedness, independent of its specific interpretation. For ease of exposition, it will also be assumed that truth is the aim of belief until §3, where alternative candidates are considered.

Section 1.a introduces two properties commonly attributed to truth-directedness: (1) that it is a constitutive or essential feature of belief, and (2) that it is individuative of belief with respect to other mental attitudes. Section 1.b considers the differences between truth-directedness and other truth-related properties of belief such as the direction of fit and the value of having true beliefs. The truth-aim is usually attributed to belief in order to explain a number of characteristics of this attitude concerning its relation with truth. Section 1.c lists the main features that truth-directedness is supposed to explain.

a. The Aim as Constitutive and Individuative of Belief 

When philosophers attribute an aim to belief, they conceive of this property as constitutive of this type of attitude. This means, roughly, that it is part of what a belief is (that is, part of the essence or the concept of belief) that it is a mental attitude directed at the truth. Let us label this the constitutivity thesis. Depending on how we conceive truth-directedness, there will be different ways of working up to this thesis. If, for example, we interpret truth-directedness as a goal of the agent (compare §2.a), we can conceive of beliefs as analogous to acts like concealing (Steglich-Petersen, 2006, p. 512). Part of what it is to conceal an object X is that it is a type of act involving the goal that someone will not find X. It is in virtue of this goal that an action counts as an instance of concealing. Similarly, a way of stating the constitutivity thesis for belief is that it is part of what S’s believing that p is that S has an aim or goal (or that it is a function of S’s cognitive system) to retain that attitude only if it is true. It is in virtue of this aim of the agent who believes (or this function of her cognitive system) that that attitude counts as belief.

Alternatively, if one interprets truth-directedness as a norm to believe only the truth (compare §2.b), the constitutivity thesis amounts to understanding this norm by analogy to rules constitutive of practices like games (Wedgwood, 2002, p. 268). A practice is constituted by a set of rules if and only if it is part of what that practice is that this set of rules is in force for agents engaged in that practice (Glüer & Pagin, 1998). Consider a specific example: chess is a game constituted by a set of rules stating which moves are legal or permissible in the game. If one plays chess, one is thereby committed by the rules of the game to perform only legal moves. The performance of a particular act does not count as a chess-move if it cannot be assessed (justified, criticized…) according to the constitutive rules of the game. Similarly, if it is part of what a belief is that it is an attitude governed by a norm to believe only the truth, a mental attitude does not count as a belief if it cannot be assessed (criticized, justified…) on the basis of this norm, as right or correct if true and wrong or incorrect if false. One can also conceive of the constitutivity thesis by analogy to other types of entity essentially constituted by norms or values. For example, it is constitutive of what it is to be a citizen to be subject to certain rights and commitments, and it is constitutive of murder to be an act of killing in a wicked, inhumane, or barbarous way (for the latter example, see Dretske, 2000, pp. 243-245).

The claim that truth-directedness is constitutive of belief can be conceived of in at least two ways, as relative to the concept of belief or to its nature. According to the conceptual interpretation, it is a condition of understanding the concept of belief that we conceive of beliefs as mental attitudes directed toward truth (Boghossian, 2003; Engel, 2004; Shah, 2003). A proper understanding of the concept of bachelor implies conceiving of a bachelor as an unmarried man. Analogously, if one has a correct grasp of the concept of belief and conceives of a mental attitude as a belief, she understands it as one that, in some sense to be specified, is directed toward truth.

Other philosophers consider truth-directedness as constitutive of the nature or essence of belief (Brandom, 2001; Railton, 1994; Velleman, 2000a; Wedgwood, 2002, 2007). The relation between belief and truth-directedness is here conceived of as one of metaphysical dependence of the former on the latter: as it is essential to water that it has a certain chemical composition (H2O), it is essential to belief that it is an attitude involving a commitment to or an aim at truth. A mental attitude counts as a belief at least partially in virtue of aiming at the truth. It is simply impossible for an attitude to be a belief if it lacks this property.

It is usually held that the essentialist interpretation of the thesis does not entail the conceptual one (for example, Wedgwood, 2007, ch. 6). It is part of the essence of water, but not of its concept, that water is H2O—we can understand the concept of water without conceiving water as having that specific chemical composition. Similarly the truth-aim may be constitutive of the essence of belief but not of its concept (see Zangwill, 2005 for a similar view). Also, some philosophers have argued that the conceptual interpretation does not entail the essentialist one (Papineau, 2013; Shah, 2003, fn. 41; Shah & Velleman, 2005, fn. 43; Wedgwood, 2007).

The second property commonly attributed to truth-directedness is the individuativity of belief: the aim is the feature that individuates belief as that type of mental state and distinguishes beliefs from other mental attitudes (Engel, 2004; Lynch, 2009; Railton 1994; Velleman, 2000a; Wedgwood, 2002). Though many other attitudes entertain relations with truth (compare §1.b), it is claimed that belief is the only attitude aiming at truth. The truth-aim plays a fundamental role in sorting out beliefs from other mental attitudes, being the distinctive feature of beliefs with respect to other types of attitude like thoughts, suppositions, desires, and imagining.

Philosophers usually appeal to the individuativity of truth-directedness for belief for two main reasons: (1) singling out the aim as a peculiarly distinctive property of belief helps to achieve a better grasp of what truth-directedness is and to distinguish this property from other properties of belief (a philosopher who assumes individuativity in order to define truth-directedness is Velleman, 2000a, pp. 247-252); and (2) individuativity provides an argument to the best explanation for the claim that belief aims at truth: as the argument goes, without assuming that belief’s truth-directedness has this peculiar individuative role, one cannot account for the difference between beliefs and other attitudes (Engel, 2004; Railton, 1994).

It has also been suggested that if truth-directedness is the distinctive feature of belief with respect to other mental attitudes, this would provide an argument for the claim that this property is also constitutive of belief (Lynch, 2009b, 81; McHugh & Whiting, 2014; Velleman, 2000a; Wedgwood, 2002). Here is a way in which this argument may proceed: if the truth-aim were not a necessary and constitutive feature of belief, it would be possible for a belief not to aim at truth. But then, assuming that the aim is the only feature distinguishing beliefs from other mental attitudes, it would be impossible to classify that attitude as a belief rather than as a different type of attitude. Thus, the truth-aim must be a feature that beliefs possess necessarily and essentially. The argument from individuativity is not the only one supporting the constitutivity of truth-directedness for belief. Since other arguments partially depend on normativist interpretations of the aim, they will be considered in 2.b.

A number of critics have pointed out that it is possible to distinguish beliefs from other types of attitude without stipulating that they involve a constitutive aim at truth. These philosophers identify the attitude of believing a proposition with that of merely holding it true or accepting it (Glüer & Wikforss, 2013; Vahid, 2009), or they take other dispositional or motivational properties of belief as distinctive of this type of attitude. For a discussion of some of these views see §2.c.

b. Differences between the Aim and Other Properties of Belief

According to many philosophers engaged in the present debate (in particular those endorsing teleological and normative interpretations), truth-directedness is supposed to characterize and distinguish belief from other types of mental attitude. This property is conceived of as unique to belief, not possessed by any other attitude. These philosophers are careful to distinguish it from other properties relating belief to truth that other attitudes also possess. In this subsection I will introduce some of these properties and explain in which respects they are supposed to differ from the aim of belief. Mentioning these other properties will provide a rough idea of what truth-directedness is not. However, before considering these properties, it is worth mentioning that some philosophers endorsing minimalist conceptions of truth-directedness tend to identify the aim with some of these properties; these alternative interpretations of the aim will be briefly mentioned in this subsection and considered in more detail in §2.c.

An obvious truth-related feature of belief is the fact that believing something is believing it to be true (Velleman, 2000a). In other words, beliefs have propositions as content, and propositions can be true or false. This property is obviously not individuative of belief, and thus cannot be identified with truth-directedness. All propositional attitudes share it with beliefs. For instance, believing that p is believing that p is true, hoping that p is hoping that p is true, imagining that p is imagining that p is true, and so on (Engel, 2004; Velleman, 2000a).

It is also commonly held that beliefs involve specific causal, functional, and dispositional-motivational roles with respect to action and behavior. Some of these roles determine another aspect under which beliefs are related to truth. Using Ramsey’s (1931) metaphor, beliefs are like maps by which we steer in the world and upon which we are disposed to act. Belief is an attitude involving dispositions to act and behave as if its content were true and to use it as premise in reasoning (Armstrong, 1973; Stalnaker, 1984). Some have argued that belief’s aim at truth can be identified with the possession of similar dispositional and functional properties. In response to this challenge, it has been argued that these properties are not sufficient to set belief apart from other mental attitudes, and thus to capture the distinctive relationship between belief and truth (Engel, 2004; Velleman, 2000a). Other types of attitude seem to possess these very same properties. For instance, attitudes like acceptance and pretense all seem to dispose the subject to act as if their content were true and have the same motivational role.

Another property commonly attributed to belief, and concerning the way it is related to truth, is its mind-to-world direction of fit. On the one hand, some attitudes, like desires, have a world-to-mind direction of fit: if what is desired is not the case, the world should be changed in order to fit what is desired, and not vice versa. On the other hand, other attitudes, like beliefs, have a mind-to-world direction of fit: if what is believed is not the case (that is, it does not fit what it is supposed to represent), the belief’s contents should be revised to fit the world, and not vice versa. This is only one way of fleshing out the distinction (see Frost, 2014 and Humberstone, 1992 for overviews of the distinction). Another popular way is to distinguish between cognitive and conative states, where cognitive states are such that the proposition in their content is regarded as something that is true, while conative states are such that they involve regarding the proposition in the content as something to be made true (Velleman, 2000a). It is difficult to evaluate the relation of the truth-directedness of belief with direction of fit, since this depends on which account of direction of fit one accepts, and there is no unique and undisputed account. Some philosophers seem to identify belief’s direction of fit with its aim at truth (Humberstone, 1992; Platts, 1979). Others (Engel, 2004; Shah & Velleman, 2005; Velleman, 2000a) distinguish the two features, arguing that other mental attitudes such as suppositions, assumptions, and imagining possess the same direction of fit as beliefs, and thus this property cannot be identified with truth-directedness, which is distinctive of beliefs. Notice that the persuasiveness of this argument depends on whether one endorses an account of direction of fit according to which other attitudes would have the same direction of fit as belief.

It is also important to distinguish the truth-directedness of belief from the value of possessing true beliefs. It has been argued that having true beliefs is something valuable (David, 2005; Horwich, 2006; Kvanvig, 2003; Lynch, 2004). We naturally prefer to have true rather than false beliefs, and tend to attribute some sort of value to true beliefs and disvalue to false ones. It seems to be a platitude that true beliefs are at least extrinsically and instrumentally valuable. For example, we might prefer true beliefs to false ones because the former are more conducive to the satisfaction of one’s desires and the avoidance of dangers. Some philosophers have argued that true beliefs also have epistemic value. For example, it has been argued that believing the truth is an intrinsically valuable cognitive success. Though one might expect there to be important connections between the two topics, the issue of whether true beliefs are valuable must be distinguished from the further issue of whether truth is the aim of belief. While the former is a matter of aims, goals, and evaluations extrinsic to the notion of belief (for example, the goal of believing truths and not believing falsehoods), the latter is a property intrinsic and constitutive of such a mental state (Vahid, 2006, 2009, p. 19). Another respect in which the two features must be distinguished is that the value of true beliefs is hardly individuative of beliefs: other types of mental state such as guesses, hypotheses, and conjectures are evaluable according to their being true or false. In spite of these important differences, some philosophers have suggested that the value of true beliefs can be at least in part related to and explained by the constitutive aim of belief, even if not identified with it (Engel, 2004; Lynch, 2004; Railton, 1994; Williams, 2002).

c. The Explanatory Role of the Aim

The hypothesis that beliefs involve an aim at truth has been used to explain a number of features specific to this mental attitude. Before considering such features, it is important to stress that not everyone who endorses some version of this hypothesis thinks that it can explain all of these features. The main features supposed to be explained by truth-directedness are the following:

  • The difficulty or impossibility of believing at will,
  • The infelicity of asserting Moorean sentences and the absurdity of having Moorean beliefs,
  • The normativity of mental content,
  • The motivational force of evidential considerations in deliberative contexts,
  • The nature of epistemic normativity and the norms governing belief and theoretical reasoning, and
  • The correctness standard of belief.

(1) As famously argued by Williams (1973), belief’s truth-aim would enable one to explain the difficulty of believing at will (see also Velleman, 2000a). Believing a proposition p at will would entail believing it without regard to whether p is true. However, if beliefs constitutively involve aiming at truth, the only considerations relevant to forming and maintaining a belief would be those in conformity to its constitutive aim; that is, truth-relevant considerations. Believing at will would thus be either impossible or very difficult. This line of argument has been widely discussed in the literature. For critical discussions see, for example, Frankish (2007); Hieronymi (2006); Setiya (2008); and Yamada (2012).

(2) Belief’s truth-directedness could also explain the infelicity of asserting Moorean sentences and the absurdity of thinking Moorean thoughts—sentences and thoughts having the form “I believe that p, but not p” (for example, Baldwin, 2007; Littlejohn, 2010; Millar, 2009; Moran, 1997; Railton, 1994). Though these sentences are not self-contradictory, if asserted, they sound odd and infelicitous. As Moore (1942, p. 543) observes, this feature of belief-ascription seems to show that self-ascribing a belief in the first person carries with it an implied claim to the truth of the believed proposition. Similar ascriptions relative to many other mental states involve either no infelicity (there is no paradox in asserting “I assume that p but it is false that p”) or a contradiction (it is contradictory to assert “I know that p but it is false that p”). The infelicity of asserting Moorean sentences can be explained as follows: on the one hand, an assertion is an act by which the speaker commits herself to the truth of what she says; on the other hand, a belief is a mental state involving an aim at the truth of the believed proposition. We can also think of this aim as a sort of commitment (Baldwin, 2007; Millar, 2009; see §2.a for normative interpretations of the aim). The infelicity would thus be due to a conflict between the respective constitutive commitments or aims of assertion and belief. By asserting a Moorean sentence like “p and I do not believe that p,” a speaker would both endorse a commitment to the truth of p and deny such a commitment at the same time. This explanation can be easily extended to an explanation of the unreasonableness of Moorean thoughts and judgments, since a judgment, like an assertion, can be considered an act involving a commitment to the truth of what is adjudged.

(3) Many philosophers argue that mental content is normative (for an overview and references, see Glüer & Wikforss, 2010). This thesis is often interpreted as the claim that there are norms governing the correct use of concepts in the content of propositional mental attitudes. An example of such norms is, for instance, that the concept white is correctly applied to an object x if and only if x is white. Some have suggested that the aim of belief can provide an explanation of the normativity of mental content. In particular, Velleman (2000a) has suggested that the normativity of content can be entirely reduced to the truth-directedness of belief: if there is a norm governing mental content, this norm applies only to the contents of attitudes that aim at truth; that is, to beliefs. Boghossian (2003) has provided an argument according to which the normativity of mental content would derive from that of belief. First, he argues that the truth-directedness of belief has to be conceived as a norm constitutive of the concept of belief. Second, he argues that there is a constitutive connection between the notions of content and belief: our grasp of the concept of content depends on the grasp of the concept of belief. The normativity of content would thus be inherited by the normativity of belief. This argument has been the target of several criticisms; see, in particular, Glüer & Wikforss (2009) and Miller (2008).

(4) Belief’s truth-directedness has also been invoked to explain certain aspects of doxastic deliberation (namely deliberation concerning what to believe). One such aspect is the motivational force of evidential considerations in deliberative contexts. In particular, Shah (2003) and Shah and Velleman (2005) have argued that truth-directedness can explain doxastic transparency, the phenomenon according to which, in the context of doxastic deliberation, the question whether to believe that p is invariably settled by the answer to the further question whether p is true. Roughly, the idea is that when an agent engages in deliberation whether to believe a given proposition, only evidential (truth-regarding) considerations can be treated as reasons for believing. Other types of considerations (for example, practical) have no motivational force in the deliberation. This can be explained by the hypothesis that the concept of belief is constitutively governed by a norm to believe p only if p is true, and that in doxastic deliberation, the agent deploying that concept in the question whether to believe that p is motivated by the truth-norm to form a belief only if it is true. This in turn explains why only truth-relevant considerations matter in answering the question. Other philosophers have provided similar explanations of doxastic transparency—and more generally of the central role of evidence in deliberative belief-formation processes—compatible with non-normative interpretations of the truth-aim (for example, Steglich-Petersen, 2006, §5). It is worth noting here that similar explanations of the impossibility of believing in response to non-evidential considerations can also be used to explain the impossibility of believing at will (see (1) above).

(5) Belief’s aim has also been invoked to explain the various norms governing belief and theoretical reasoning, and to shed light on the nature of epistemic normativity in general. For example, according to Velleman, belief’s truth-directedness accounts for the justificatory force of theoretical reasoning. Theoretical reasoning justifies a belief by adducing considerations that indicate it to be true (2000a, p. 246). This is the case because being true is what satisfies the aim of belief. Other philosophers have argued that belief’s aim helps to explain norms of rationality and justification governing beliefs, and, more generally, the nature of epistemic normativity (Boghossian, 2003; Millar, 2004, 2009; Shah & Velleman, 2005; Sosa, 2007; Wedgwood, 2002, 2013). A common explanation takes these norms as instrumentally conducive to the satisfaction of the constitutive truth-aim of belief. This approach to epistemic normativity is not new in the literature. Many philosophers of the past have argued that epistemic standards of justification and rationality would be derivable from the fundamental goal of believing truly and avoiding falsehoods (for an overview see Alston, 2005, chs. 1 and 2). Criticisms of this type of approach to epistemic normativity typically mirror arguments against similar approaches in the practical domain. See, for example, Berker (2013); Firth (1981); Kelly (2003); and Maitzen (1995).

The various attempts to reduce or explain epistemic normativity in terms of a fundamental aim or norm of truth governing belief are considered by some philosophers as part of a wider project directed at providing analogous accounts for other normative domains. In particular, some have argued that practical normativity can be traced back to constitutive norms of action and agency, which in turn would determine derivative norms of practical rationality and justification (Korsgaard, 1996; Shah, 2008; Velleman, 2000a; Wedgwood, 2007).

(6) According to some philosophers, the aim at truth would also explain why a belief is correct if and only if it is true, that is, the so-called correctness standard of belief (Steglich-Petersen, 2006, 2009; Velleman, 2000a). Philosophers endorsing teleological interpretations of the aim hold that the standard would be an instrumental assessment indicating the measure of success that a belief must attain in order to achieve its constitutive aim. However, this thesis is the subject of major disagreements. Philosophers giving normative interpretations of truth-directedness either identify the correctness standard with the constitutive aim of belief (Engel, 2007; Wedgwood, 2002), or argue for the independence of the two (Shah & Velleman, 2005).

2. Interpretations of the Aim

In the contemporary debate, there is a wide disagreement on how to interpret the claim that belief aims at truth. There are two main interpretations of the aim: teleological and normativist. According to teleological accounts, the aim of belief is an intentional purpose of subjects holding beliefs, or a functional goal of cognitive systems regulating the formation, maintenance, and revision of beliefs. Normativist accounts hold that the claim that beliefs have an aim must be interpreted metaphorically. According to normativists, truth-directedness is better understood as a commitment, a norm governing the regulation of beliefs (their formation, maintenance, and revision). Other philosophers have endorsed minimalist accounts of truth-directedness, denying that beliefs aim at truth in a substantive sense.

a. Teleological Interpretations

Teleological (also called “teleologist”) interpretations hold that beliefs are literally directed at truth as an aim, an end, or a goal (telos in Greek). This aim would be realized in truth-conducive processes and practices of belief-regulation, whose role is the formation, maintenance, and revision of beliefs. An attitude would count as a belief only if it is formed and regulated by these processes and practices. An advantage commonly attributed to teleological interpretations is that they seem more compatible with a naturalistic account of belief than rival interpretations (in particular, normativist ones). The thought is that intentions, goals, or functions can be accounted for in naturalistic terms. Furthermore, this interpretation fits naturally with broadly instrumentalist, naturalistically unproblematic conceptions of epistemic normativity and epistemic rationality (note, however, that these conceptions have been the target of many criticisms; for example, Berker, 2013; Kelly, 2003).

Teleological interpretations differ with respect to how they conceive the aim at truth. Some teleologists interpret the aim of belief as an intentional goal of the subject, such as an interest in accepting a proposition only if it is true. For example, according to Steglich-Petersen (2006), believing is accepting a proposition with the purpose of getting its truth-value right. On such an interpretation, the aim is realized through deliberative practices like judgments, in which an agent accepts a proposition only if she has evidence in support of its truth, and maintains that acceptance in the absence of contrary evidence. Steglich-Petersen recognizes that many of our beliefs are regulated in entirely sub-intentional ways. However, he argues that only beliefs considered at an intentional level are connected to a literal aim:

cognitive states and processes that are not connected with any literal aim or intention of a believer can nevertheless count as ‘beliefs’ in virtue of […] being to some degree conducive to the hypothetical aim of someone intending to form a belief in the primary strong sense. (2006, p. 515)

Other philosophers have advanced sub-intentional interpretations of the aim, conceiving it as a functional goal of the attitude or the psychological system to form true beliefs and revise false ones. This function would be regulated at a sub-personal, often unconscious level. A similar approach has been defended by Bird (2007) and McHugh (2012b). Some authors also interpret certain functionalist accounts such as those of Burge (2003), Millikan (1984), and Plantinga (1993) as teleological in this sense (see, for example, McHugh, 2012a, fn. 6, 2012b, fn. 49).

The most popular interpretation of the aim is a “mixed” one, according to which truth-directedness would be constituted by both intentional and sub-intentional processes. In particular, Velleman (2000a) maintains that there is a broad spectrum of ways in which the aim can be regulated. While sometimes it is realized in the intentional aim of a subject in an act of judgment about a certain matter, at other times there are cognitive systems in charge of the regulation of belief designed to ensure the truth of such mental states. Such systems would carry out this function more or less automatically, not relying on the subject’s intentions. Other philosophers who distinguish between intentional and sub-personal levels of regulation of the aim include Millar (2004, ch. 2) and Sosa (2007, 2009).

A well-known objection to teleological accounts, provided by Owens (2003), is specifically directed at intentional and mixed interpretations of the aim (for similar objections see Kelly, 2003). Owens observes that if beliefs aim at truth as argued by teleologists, believing would be similar to guessing. Guesses are mental acts aiming at truth, in the sense that when one guesses, one strives to give the true answer to a question. As Owens writes,

a guesser intends to guess truly. The aim of a guess is to get it right: a successful guess is a true guess and a false guess is a failure as a guess. Someone who does not intend to guess truly is not really guessing. (2003, p. 290)

According to the teleological perspective, similar considerations apply to belief, which is a mental state held with the purpose of holding it only if it is true. But there are at least two important disanalogies between the intentional aim involved in guessing and the aim of belief.

First, the aim of belief does not interact with other aims of the subject in the way the truth-aim of guesses does. The aim of guessing (like that of other goal-directed activities) can interact with the subject’s other goals and objectives; it can conflict with them and be weighed against them. In particular, when we guess, we integrate the truth-aim constitutive of guessing with other purposes, such as the practical relevance of guessing, and we consider guessing that p reasonable when aiming at the truth by means of a guess that p would maximize expected utility (Owens, 2003, p. 292). If beliefs, like guesses, constitutively involve an aim at truth, then we should expect that, on at least some occasions, we would weigh the aim of belief against other aims. For example, when engaged in deliberation about whether to believe a given proposition, our pursuit of the truth-goal should be constrained by our other goals and purposes in the usual way. But belief’s aim does not work like this. A large reward for believing that today is not sunny gives me a reason to try to believe it, but, in deliberation about what to believe, such considerations do not interact with and cannot be weighed against the truth-aim of belief in the way they do with other aims and purposes of the subject. In this respect, belief appears to be “insulated” from all but one aim, in a way that aim-directed behaviors in general are not (McHugh, 2012a, p. 430).

The second disanalogy suggested by Owens is that, in guessing, we can exercise a kind of voluntary control that is not possible in the case of belief. The guesser can compare different considerations and then decide whether to terminate her inquiry and guess. Nothing similar happens in deliberation about whether to believe a given proposition, where one cannot decide when to conclude one’s inquiry and start believing the proposition. The deliberation is concluded more or less automatically and cannot be controlled by reflection on how best to achieve the aim. Given these disanalogies, Owens concludes that while a guess is an attitude regulated by an intentional aim at truth, belief is not.

Teleologists have provided some replies to Owens’s argument. In particular, it has been argued that the aim of belief does in fact interact and can be weighed with other aims (Steglich-Petersen, 2009); it has been denied that evidential considerations play the exclusively prominent role in belief-formation suggested by Owens’s argument (McHugh, 2012a; for a similar point, though not directly related to Owens’s argument, see Frankish, 2007); and it has been argued that the direct form of control we have over the formation of guesses, but not of beliefs, can be explained by the fact that belief is a mental state, while guessing is a mental act (Shah & Velleman, 2005).

A related problem for a teleological interpretation of the aim is that sometimes we are completely indifferent to certain matters, and sometimes we even prefer (have the goal or aim) not to have any belief about them. Nevertheless, evidence bearing on these matters constitutes a reason for us to form the corresponding beliefs, and if presented with such evidence in normal circumstances, we cannot refrain from forming beliefs about these matters. This seems to show that truth-directedness, and more generally epistemic rationality, cannot be reduced to aim-directed activities in the ordinary sense of the term (Kelly, 2003).

Another very popular argument against teleological accounts of truth-directedness is Shah’s (2003) “teleologist’s dilemma”. The dilemma relies on the following observation: on the one hand, in practices of doxastic deliberation—deliberation directed at forming a belief about a certain matter—considerations concerning the evidence in support of the truth of a given proposition are the only ones that are relevant in order to answer the question whether to believe that proposition (this is what Shah calls the phenomenon of doxastic transparency; compare §1.c). On the other hand, some belief-formation processes can be influenced by non-evidential factors (for example, cases of wishful thinking). In an attempt to explain these two types of belief formation, the teleologist is pushed in two incompatible directions: she can consider the truth-aim as a disposition so weak as to allow cases in which beliefs are caused by non-evidential processes, in which case she cannot account for the exclusive influence of evidential considerations in deliberative contexts of belief-formation; alternatively, in order to account for the exclusive role of evidence in doxastic deliberation, she can strengthen the disposition that constitutes aiming at truth so that it excludes the influence of non-truth-regarding considerations from such kinds of reasoning—but then she cannot accommodate non-deliberative cases in which non-evidential factors influence belief-formation. In either case, the teleologist cannot explain the truth-regulation of belief in both deliberative and non-deliberative contexts. Therefore, a teleologist interpretation of the aim is not sufficient alone to provide an explanation for the truth-directedness of beliefs in all processes of belief formation.

In order to address this problem, Shah & Velleman (2005) argue that belief is regulated by two levels of truth-directedness: a sub-intentional teleological mechanism responsible for weak regulation in non-deliberative contexts, and one conceived in normative terms, able to explain the strong truth-regulation in deliberative contexts (see §2.b). For accounts of the dilemma compatible with a teleologist perspective, see, for example, Steglich-Petersen (2006) and Toribio (2013). For other objections to the teleological account, see Engel (2004) and Zalabardo (2010).

b. Normative Interpretations

Another way of interpreting belief’s truth-directedness is in normative terms. According to normativist accounts of the aim of belief, the claim that “belief aims at truth” is just a metaphorical way of expressing the thought that beliefs are constitutively governed by a norm prescribing (or permitting) one to believe the truth (or only the truth). For example, if Mary forms the belief that it is now noon, she does what the norm requires (that is, she holds a correct belief) if that proposition is true, and she violates the norm if that proposition is false. Many normativists identify the norm of belief with a standard of correctness:

(C) a belief is correct if and only if the believed proposition is true

These philosophers take this standard to be constitutive of the essence or the concept of belief: belief would be a mental state characterized by the fact of being correct if and only if it is true (see §1.a for more details on normativist interpretations of the constitutivity thesis). This interpretation of the aim is probably the most popular in the early 21st century. It has been defended by, among others, Boghossian (2003); Engel (2004, 2013); Gibbard (2005); Millar (2004); Shah (2003); Wedgwood (2002, 2007, 2013).

Let us here dispel a common confusion about the claim that belief is constitutively governed by a norm: that a truth-norm constitutively governs belief does not mean that all beliefs necessarily satisfy that norm. What is constitutive of belief is not the satisfaction of the norm (as a matter of fact, many beliefs happen to be false, and thus incorrect), but the fact that the norm is in force and that believers and their beliefs can be assessed and criticized according to it: as correct if the belief is true, and incorrect if it is false.

One of the best known arguments for a normative interpretation of truth-directedness, suggested by Shah (2003), is the argument to the best explanation of doxastic transparency. As mentioned in §2.a, transparency is the (alleged) phenomenon according to which the deliberative question whether to believe a given proposition p is invariably settled by answering the further question whether it is true that p. The two questions are answerable to the same set of considerations; that is, considerations concerning the evidence for or against the truth of p. This phenomenon is specific to deliberative contexts in which an agent explicitly considers whether to believe a given proposition. In such contexts, only evidential (truth-relevant) considerations can influence belief-formation. By contrast, in contexts in which a subject forms a belief without passing through a deliberative process, non-evidential considerations can influence belief-formation.

According to Shah, only a normative interpretation of the aim of belief can explain these facts: doxastic transparency, why this phenomenon is specific to doxastic deliberation, and the exclusive role of evidential considerations in deliberative contexts. The explanation is the following: let us assume that it is constitutive of the concept of belief that a belief is correct if and only if it is true. This is interpreted as the claim that someone believing a proposition p is under a normative commitment to believe p only if it is true. When a subject engages in doxastic deliberation and asks herself whether to believe a given proposition, she deploys the concept of belief. Assuming she understands this concept and is aware of its application conditions, she interprets the question as whether to form a mental attitude that she should have only if the proposition is true. This in turn determines a disposition to be moved only by considerations relevant to the truth of p, which explains transparency and the exclusive role of evidential considerations in deliberative contexts. In non-deliberative contexts where belief-formation works at a sub-intentional level, by contrast, the subject does not explicitly consider the question whether to believe p, does not deploy the concept of belief, and is therefore not motivated by the norm to regard only truth-relevant considerations as relevant in the process of belief-formation. For this reason, non-evidential factors can influence belief-formation in these contexts. In sum, Shah’s normativist account allows him to explain both the strong role of truth in the regulation of belief in deliberative contexts and its weak role in non-deliberative ones.

An objection to Shah’s argument is that it assumes an implausibly strong form of motivational internalism according to which the norm of belief necessarily and immediately motivates the agent when she recognizes and accepts it. This contrasts with the ways in which, in general, norms tend to motivate agents (McHugh, 2013; Steglich-Petersen, 2006).

Another argument for a normativist account of truth-directedness, suggested by Wedgwood (2002), proceeds in two steps. First, it is argued that the correctness standard of belief expresses a relation of strong supervenience (the correctness of a belief strongly supervenes on the truth of that belief’s content). The standard thereby articulates a necessary feature of belief: necessarily, all true beliefs are correct and all false beliefs are incorrect. Second, since the standard articulates a necessary feature of belief, it is an essential feature of belief. Both steps of the argument have been criticized (for example, Steglich-Petersen, 2008, pp. 277-278). Against the second step, one cannot infer from a thing’s necessarily possessing a certain property that the property is essential to the thing that possesses it. To use a well-known example of Fine (1994, pp. 4-5), one cannot infer from the necessary claim that Socrates is the only member of the singleton whose only member is Socrates to the claim that it is essential to Socrates that he is the only member of that singleton. Against the first step, it has been argued that it relies on contentious assumptions about normative supervenience: it is an error to deduce, from the supervenience of a normative property N on a non-normative property G, the necessity of the claim that every object having property G also has property N. The most one can conclude is that, necessarily, if some object has property N in virtue of having property G, then anything with property G also has property N (where the necessity takes wide scope over the conditional). For similar considerations on normative relations of supervenience, see Blackburn (1993, p. 132); and Steglich-Petersen (2008).
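The scope distinction invoked in this last objection can be displayed schematically. The following is only an illustrative gloss, with the “in virtue of” connective left unanalyzed; the notation is not the cited authors’ own. Where G is a non-normative property and N a normative one, supervenience does not by itself license

\Box\, \forall x\, (Gx \rightarrow Nx)

but at most the wide-scope claim

\Box\, \big( \exists x\, (Nx \text{ in virtue of } Gx) \rightarrow \forall y\, (Gy \rightarrow Ny) \big)

Reading G as “is a true belief” and N as “is correct,” the first, stronger claim is what Wedgwood’s argument needs, whereas the second is all that the supervenience thesis by itself appears to deliver.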

Other arguments often used in support of the normativist interpretation do not clearly favor this interpretation over alternative substantive conceptions of truth-directedness, such as teleological ones. For example, it has been argued that unless one assumes that belief is constitutively governed by a truth-norm, one is not in a position to distinguish beliefs from other cognitive propositional attitudes, such as assuming, thinking, or imagining. The assumption that belief is constitutively governed by a truth-norm has also been used in arguments to the best explanation of a number of features of belief such as (1) the infelicity of asserting Moorean sentences; (2) the disposition to rely on a believed proposition as a reason for action and a premise in practical reasoning (Baldwin, 2007, p. 83); and (3) the relation between belief, assertion, evidence, and action (Griffiths, 1962). See §1.c for discussion of some of these arguments. These various arguments have received formulations both in normativist and teleological terms (for teleological formulations, simply replace occurrences of “truth-norm” with “truth-aim”); for this reason, they do not favor either interpretation. It is also worth mentioning that the claim that belief is constitutively normative has received indirect support from views that, for independent reasons, hold that intentional attitudes in general are constitutively normative (Brandom, 1994; Millar, 2004; Wedgwood, 2007).

Though the normativist interpretation has been the most popular in the last two decades, it has also been the target of several criticisms. According to the No Guidance Argument, a truth-norm is incapable of guiding an agent in the formation and revision of her beliefs. One can conform one’s beliefs to a norm requiring one to believe only true propositions only by first forming beliefs about whether these propositions are true. The only way to follow this norm will thus be continuing to believe what one already believes. Such a norm would not provide any guidance as to what a subject should do in order to comply with it. More precisely, this norm would have no guiding role in processes of belief regulation (formation, maintenance, and revision). Versions of this argument have been given by Glüer & Wikforss (2009, pp. 44-45); Horwich (2006, p. 354); and Mayo (1963, p. 141). A reply to this argument consists in arguing that even if the truth-norm does not provide any direct guidance, it can guide belief regulation indirectly, via some other derived principle like norms of evidence and rationality (Boghossian, 2003; Wedgwood, 2002); or it could guide in specific contexts, such as in doxastic deliberation where an agent explicitly considers her evidence for a given proposition p with the aim of making up her mind about whether p (Shah & Velleman, 2005). For a further defense of the argument see Glüer & Wikforss (2010b, 2013).

Another criticism of the normative interpretation is that, in general, an agent subject to a norm should have some form of intentional control over the actions necessary to satisfy it and be free to choose whether to conform to the norm or not. These conditions on control and freedom to comply seem to be constraints on norms in general. However, belief formation is (at least often) an involuntary process and is realized at an automatic, non-inferential level. It is thus unclear how a truth-norm governing belief can satisfy the above constraints. For versions of this objection, see Glüer & Wikforss (2009); and Steglich-Petersen (2006).

Another problem for normative interpretations of truth-directedness concerns the formulation of the alleged norm of belief. If beliefs are constitutively governed by a truth-norm, it should be possible to state this norm in terms of some duty, prescription, or permission. However, all the formulations suggested so far seem to face some problem or other. Bykvist and Hattiangadi (2007), in particular, consider several possible formulations of the norm and conclude that none of them is free from problems. The best known formulations are the following:

(1) For any S, p: S ought to (believe that p) iff p

(2) For any S, p: if S ought to (believe that p), then p

(3) For any S, p: S ought to (believe that p iff p)

All these formulations are flawed in some way. (1) implies that one ought to believe every true proposition, including trivial and uninteresting ones (see also Sosa, 2003 for a similar point). Furthermore, provided there are true propositions that it is impossible to believe (for example, it is raining and nobody believes that it is raining), (1) violates the commonly accepted rule according to which “ought” implies “can.” (2) seems not to be normatively interesting because it is unable to place any requirement on believers: if p is true, nothing follows from it about what S ought to believe; and if p is false, it only follows that it is not the case that S ought to believe that p, not that S ought not to believe that p. (3) is problematic because it does not allow one to derive claims about what one ought or ought not to believe. For example, from (3) and the falsity of p, one cannot derive that one ought not to believe p. Furthermore, (3) seems to be subject to the same objections raised against (1).
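For readers who prefer a symbolic gloss, the three formulations and the blindspot problem can be displayed as follows (the notation, with O_S for “S ought to” and B_S for “S believes that,” is merely illustrative and is not drawn from the cited texts):

(1*) \forall S\, \forall p\, (O_S B_S p \leftrightarrow p)

(2*) \forall S\, \forall p\, (O_S B_S p \rightarrow p)

(3*) \forall S\, \forall p\; O_S (B_S p \leftrightarrow p)

Take the blindspot proposition r = q \wedge \neg \exists S\, B_S q, with q standing for “it is raining.” If r is true, (1*) requires believing r; but, assuming belief distributes over conjunction, any subject who believes r thereby believes q and so falsifies r’s second conjunct. The requirement can thus never be fulfilled while r is true, which is one way of spelling out the clash with “ought” implies “can” noted above.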

Bykvist and Hattiangadi raise similar objections to other formulations of the truth-norm. From this, they conclude that this general failure could be considered a clue that belief is not at all a normative concept, at least not in the way suggested by normativists. Many have considered this conclusion too hasty. First, even if all the available formulations of the norm are wrong, this does not mean that it is impossible to formulate the norm of belief in “ought” terms; it could simply mean that the right formulation is yet to come. Second, some have suggested alternative formulations that seem to avoid the above problems. For example, Whiting (2010) has suggested that interpreting the truth-norm as a norm of permissibility could avoid most of the problems. Other formulations avoiding these problems have been suggested by Littlejohn (2010), Fassio (2011), and Raleigh (2013). For a discussion, see Bykvist and Hattiangadi (2013). A third way of avoiding these objections is to deny that the norm of belief is a truth-norm (see §3).

One reply to the objections considered above consists in abandoning a deontic conception of the truth-norm, according to which the norm is a kind of prescription, directive, or permission. Adopting an alternative, non-deontic interpretation of the norm would allow one to avoid many of these objections. Some have suggested interpreting the normativity of belief in evaluative terms; that is, in terms of what it is good (in a certain sense of “good” to be specified) to believe (Fassio, 2011; McHugh, 2012b). Others have interpreted the norm of belief as involving a type of normativity sui generis (McHugh, 2014; Rosen, 2001, p. 621), as an ideal of reason (Engel, 2013), or as an “ought to be” in the Sellarsian sense, which does not require addressees of the norm to be capable of voluntarily following it (Chrisman, 2008).

For other criticisms of normativist interpretations of truth-directedness, see also Davidson (2001); Dretske (2000); Horwich (2006, 2013); and Papineau (1999, 2013). The common factor in these criticisms is the defense of the thesis that if there are norms governing beliefs, these are practical, contingent, and not constitutive of belief. Replies to some of the above objections are in Engel (2007, 2013); Shah & Velleman (2005); and Wedgwood (2013).

c. Minimalist Interpretations

The label “minimalist interpretations” is used here for a range of different views. The common factor of these views is that they deny that there is such a property as a truth-aim of belief, at least if one identifies it with some feature different from those considered in §1.b. Minimalists hold that the features supposedly explained by the aim of belief (see §1.c) can be explained by other properties commonly ascribed to these mental states, such as their causal, dispositional, functional, or motivational roles, or their direction of fit (Davidson, 2001; Dretske, 2000; Papineau, 1999).

Given the present characterization, one may wonder whether there is a clear-cut dividing line between teleological and minimalist views; in particular, between sub-intentional teleological views, identifying the aim at truth with functional mechanisms of the cognitive system, and dispositionalist and functionalist minimalist accounts. A way of distinguishing these two approaches is by looking at the dispositional or functional role distinctive of belief (compare McHugh, 2012a, fn. 6): while functionalist accounts congenial to teleological approaches to the truth-aim focus on the input side of belief’s functional role, and exclusively identify this role with a truth-directed goal (for example, forming true beliefs), minimalists think that the role that individuates belief is at least partially on the output side, and they are mainly concerned with practical roles of belief, such as satisfying the subject’s desires or providing reasons for action.

Some philosophers have endorsed accounts of belief according to which the causal, dispositional, and/or functional roles of beliefs with respect to action and behavior would be sufficient to characterize and individuate this type of mental attitude. Some have argued that the specific relation between belief and truth can be fully explained by the functional role of beliefs of providing reasons for action. Others have argued that all that is necessary for an attitude to qualify as a belief is that it dispose the subject to behave in ways that would promote the satisfaction of her desires if its content were true. For similar views see, for example, Stalnaker (1984). Armstrong (1973) argues that an essential function of beliefs is to move a subject to action given the presence of suitable dominant desires and purposes, and locates in this causal role the peculiar difference between belief and other mental attitudes such as mere thoughts. Still others have identified belief’s aim at truth with its direction of fit (Humberstone, 1992; Platts, 1979).

A “deflationary” interpretation of truth-directedness has been defended by Vahid (2006, 2009). Vahid first considers the feature of accepting-as-true, introduced by Velleman (2000a), as common to all cognitive states (beliefs, assumptions, conjectures, imaginations…). He suggests that to capture the truth-directedness of belief, one should not add any further (teleological or normative) property to the fact that belief is an attitude of regarding-as-true. Rather, what is distinctive of belief according to Vahid is the specific way in which one regards-as-true a given proposition. While other attitudes involve regarding a proposition as true for the sake of something else, in order to reach certain specific goals (for example, assuming is regarding-as-true for the sake of argument, imagining involves regarding a proposition as true for motivational purposes), believing is regarding a proposition as true for its own sake, as an end in itself.

The main criticism directed at minimalist interpretations is that other mental states such as suppositions and assumptions possess these same properties (same causal, functional, dispositional, and motivational roles; same direction of fit) and, thus, that these properties are not sufficient alone to individuate the peculiar truth-directedness of belief, to explain the special features of belief listed in §1.c, and to distinguish beliefs from other types of mental attitude (Engel, 2004; Velleman, 2000a). For a reply, see, for instance, Zalabardo (2010, §10), who challenges the claim that a purely motivational conception of belief would not be sufficient to distinguish beliefs from other mental attitudes. See also Glüer & Wikforss (2009, p. 42).

3. What Does Belief Aim At?

There is debate concerning whether the aim of belief is truth, as has traditionally been argued, or some other property. Since the late 1990s, an increasing number of philosophers have defended the claim that knowledge is the fundamental aim or norm of belief. Upholders of this view include Adler (2002); Bird (2007); Huemer (2007); Littlejohn (2013); Peacocke (1999); Sutton (2007); and Williamson (2000). The best known defender of the thesis that belief aims at knowledge is Williamson (2000). Williamson’s main motivations for holding this thesis derive from his view about the nature of knowledge and its relation to belief. Williamson criticizes the idea that it is possible to provide an analysis of knowledge in terms of other more fundamental notions. Rather, other epistemic notions such as belief and justification should be understood as derivative from the more fundamental notion of knowledge; this is the so-called Knowledge First approach in epistemology. In particular, Williamson suggests that belief be considered roughly as the attitude of treating a proposition as if one knew it. Knowledge would thus fix the standard of appropriateness or the success condition for a belief, and merely believing p without knowing it would be a sort of “botched knowing” (2000, p. 47). In this sense, belief would not aim merely at truth but at knowledge.

A well-known argument for the knowledge aim is based on a parallel between assertion and belief. On the one hand, many have argued that assertion is constitutively governed by a knowledge norm (Adler, 2002; Bird, 2007; Sutton, 2007; Williamson, 2000):

(KNA) one should assert p only if one knows p.

On the other hand, some philosophers have suggested that occurrent belief is the inner analogue of assertion (for example, Williamson, 2000, pp. 255-256). More precisely, the idea is that (flat-out) assertion is the verbal counterpart of a judgment, and a judgment is a form of occurrent (outright) belief. If so, it is plausible that assertion and belief are governed by the same norm, and knowledge would be the norm of belief too:

(KNB) one should believe p only if one knows p.

Similar arguments have been suggested by Adler (2002); Bird (2007); McHugh (2011); Sosa (2010, p. 48); Sutton (2007); and Williamson (2000). To this line of argument, it may be objected that knowledge is not the norm of assertion. Some philosophers have suggested counterexamples to this thesis (Brown 2008; Lackey 2007). Others have argued that assertion is governed by other norms such as truth or justification (Douven, 2006; Weiner, 2005). Another way to criticize this argument consists in challenging the similarity between belief and assertion, arguing in particular that they would not be governed by the same norms. Whiting (2013, pp. 187-188) provides some reasons why one should expect standards of belief and assertion to diverge: since assertion is an “external” act, involving a social dimension, in evaluating an assertion one might have to take into account the expectations and needs of interlocutors and the role of speech acts in the unfolding conversation. Furthermore, assertion is a potential source of testimony. In asserting, one takes on responsibility for others’ beliefs. All these considerations are extraneous to belief, which is a “private” state of mind. It would thus not be surprising if assertions were governed by more demanding epistemic standards than belief due to their social character and their communicative role. Brown (2012, pp. 137-144) provides another argument against the claim that assertion and belief share the same epistemic standard: she first argues that whether an assertion or belief is epistemically appropriate partially depends on its consequences (for example, the epistemic propriety of asserting varies with the stakes), and second, that the consequences of asserting p may differ from those of believing p. It follows that there can be cases in which it is epistemically appropriate for a subject to believe that p, but not to assert that p, and vice versa.

Considerations about versions of Moore’s paradox with “know” in place of “believe” provide another argument for the claim that knowledge is the aim or norm of belief (Adler, 2002; Gibbons, 2013; Huemer, 2007; Sutton, 2007). Just as it sounds absurd or infelicitous to assert sentences like “it is raining but I do not know it is raining,” it seems incoherent to believe that it is raining and at the same time that one does not know that it is raining. An explanation of this fact could be that knowledge is the aim or norm of belief: a subject believing that it is raining but that she does not know it would violate (KNB). This type of argument has been the target of two objections. First, some have argued that a weaker standard, like a truth-norm, would be sufficient to explain the absurdity of this sort of Moorean belief (see, for example, the explanation considered in §1.c). Second, it has been argued that while asserting Moorean propositions of the form “it is raining but I do not know it is raining” sounds absurd, there is no such absurdity in believing these same propositions. It seems both reasonable and appropriate to believe something even while believing that one does not know it (McGlynn, 2013; Whiting, 2013, pp. 188-189). To use an example from McGlynn (2013, p. 387), there is nothing unreasonable or incongruous in Jane’s believing that her ticket will lose, that this belief is justified, and that nonetheless this belief fails to count as knowledge. A reply to the latter criticism consists in distinguishing between outright (or full) belief and partial belief: only outright belief would be subject to a knowledge norm. For responses to this move, see McGlynn (2013, §3) and Whiting (2013, p. 189).
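A compressed way to see why (KNB) would predict the incoherence of the Moorean belief, on the standard assumptions that knowledge is factive and distributes over conjunction (this is an illustrative reconstruction, not a derivation given in the cited texts):

Suppose S believes p \wedge \neg K_S p. By (KNB), the belief is in order only if K_S (p \wedge \neg K_S p). Distribution yields K_S p \wedge K_S \neg K_S p; factivity applied to the second conjunct yields \neg K_S p; together these give K_S p \wedge \neg K_S p, a contradiction.

Hence no one can satisfy (KNB) while holding the Moorean belief, which would explain its apparent incoherence.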

Another argument for the knowledge aim/norm of belief is provided by the way in which we tend to assess (justify and criticize) our beliefs. Williamson (2005, p. 109) provides the following case: John is at the zoo and sees what appears to him to be a zebra in a cage. The animal in the cage is really a zebra. However, unbeknownst to John, to save money, most of the other animals in the zoo have been replaced by cleverly disguised farm animals. In this scenario, John’s belief is true and fully reasonable (after all, he has no reason to believe that the animal in the cage could not be a zebra). Still, John does not know it is a zebra. Intuitively, John needs an excuse for believing that the animal is a zebra. He can excuse himself by pointing out that he is in no position to distinguish his state from one of knowing. The need for an excuse indicates that it is wrong for John to believe that the animal is a zebra. Despite the fact that John’s belief is both reasonable and true, it is somewhat defective. This contrasts with knowing that it is a zebra, which provides a full justification for believing it, not a mere excuse. A reply to this argument is that John is not wrong in believing that the animal is a zebra (and thus does not need an excuse for this). Rather, if John stands in need of correction, this is due to the false background beliefs he has—for example, the belief that animals in this area are all zebras (Littlejohn, 2010; Whiting, 2013).

Some philosophers have argued that knowledge is the aim or norm of belief on the ground that knowledge has more value than mere true belief (Bird, 2007). However, on the one hand, there is no general agreement on whether knowledge is more valuable than true belief. On the other hand, from the fact that knowledge is more valuable than true belief, it does not follow that belief is governed by a norm or involves a constitutive aim at knowledge.

For other arguments in support of the claim that the aim or norm of belief is knowledge, see Bird (2007); Engel (2004); McHugh (2011); and Sutton (2007). For other criticisms of the claim, see Littlejohn (2010); McGlynn (2013); and Whiting (2013).

Though truth and knowledge are widely identified as the main candidates for being the aim or norm of belief, some philosophers have suggested other properties. Another available option is that the aim of belief is reasonability or justification. For a defense of the claim that non-factive justification is the condition of epistemic success for belief, see, for example, Feldman (2002, pp. 378-379). It should be noted that this view has not found many proponents in the 21st century literature, at least if we exclude philosophers for whom justified belief is factive or requires knowledge (for example, Gibbons, 2013; Littlejohn, 2012; Sutton, 2007). An original approach is that of Smithies (2012), who argues that the fundamental norm of belief is that one be in a position to know what one believes, where “[o]ne is in a position to know a proposition just in case one satisfies all the epistemic, as opposed to psychological, conditions for knowledge, such as having ungettiered justification to believe a true proposition.” (2012, p. 4)

Some philosophers have identified the aim of belief with some specific kind of epistemic virtue one could manifest in the possession of belief (Zagzebski, 2004) or with understanding (Kvanvig, 2003). According to other philosophers, the fundamental aim of belief consists in the satisfaction of practical goals such as survival, utility, or the satisfaction of desires and wants. Similar views are particularly popular among philosophers favoring naturalistic approaches in the philosophy of mind. For example, Millikan (1984) argues that beliefs are integrated in a naturally selected cognitive system having the function of tracking features of the world in order to help in the satisfaction of biological needs such as survival and reproduction. A similar view has been more or less explicitly endorsed by, for example, Horwich (2006); Kornblith (1993); and Papineau (1999, 2013).

Another option is that belief possesses multiple aims or norms equally fundamental and irreducible to each other. For example, Weiner (2014) endorses pluralism about epistemic norms, arguing that there are many different epistemic norms, each valid from a different standpoint, and that no one of these standpoints need be better than another. In a virtue-theoretic framework, Wright (2014) argues that there are two fundamental epistemic aims: believing in accordance with the intellectual virtues (such as intellectual courage, carefulness, and open-mindedness), and believing the truth and avoiding falsehoods.

4. Relevance of the Topic

The topic discussed in the present article is relevant to several more general philosophical debates. It is clearly relevant in those areas of philosophy directly concerned with the notion of belief, such as the philosophy of mind, and in particular the ontology of mental attitudes. One of the issues that have traditionally most concerned philosophers is that of individuating the distinctive feature of belief with respect to other mental attitudes such as trusting, mere thinking, imagining, guessing, and so on. As David Hume admits (1739, book I, part III, §7), the distinction between belief and mere thought was the first philosophical problem the Scottish philosopher set himself, and also one of the hardest he found to solve (for a discussion, see Armstrong, 1973, part I, §5). Whereas the difference between belief and other types of mental attitude seems to reside in the specific relationship that belief bears to truth (or knowledge), it has proved extremely difficult to grasp the peculiar nature of this relationship. The 21st century debate on the aim of belief promises to provide an answer to this problem. In answering questions about the nature of the aim (see §2), it also promises to shed some light on the issue of whether belief is a normative attitude or whether it can be characterized by a fully naturalistic account.

Progress in the debate about the aim/norm of belief also contributes substantially to the study of the norms and aims of other attitudes such as desires, emotions, and intentions. The aim of such studies is to provide a unified and coherent picture of the various norms and aims of mental attitudes and of their reciprocal relations (for example, Millar, 2009; Railton, 1997; Shah, 2008; Velleman, 2000b; Wedgwood, 2007). For instance, Wedgwood (2007) interprets the aims of attitudes as norms of correctness and argues that such norms are constitutive and individuative of all intentional attitudes. Similarly, Shah (2008) applies an argument analogous to the transparency argument for belief (considered in §2.b) to other attitudes. In particular, he argues that the hypothesis that the concept of intention is governed by a constitutive norm would best explain the presumed fact that, in order to conclude deliberation on whether to intend to A, one must answer the question whether to A.

The debate on the aim of belief is also particularly relevant to certain views in the philosophy of normativity. According to a prominent view in meta-ethics (so-called Constitutivism), normative facts can be grounded in facts about the constitution of action or agency. According to this view, agency is constitutively governed by practical norms (for example, Korsgaard, 1996; Velleman, 2000b). Some philosophers have tried to extend the view to other normative domains such as epistemology and aesthetics. The epistemological analogue of Ethical Constitutivism holds that epistemic normativity can be grounded in the constitutive aim or norm of belief. For a criticism of constitutivist approaches, see Enoch (2006).

The view that epistemic normativity is grounded in a fundamental truth-aim of belief has deep roots in the 20th century history of epistemology. Many philosophers in the past have argued that there is a strict dependence relation between the fundamental aim or norm of belief (sometimes presented as a conjunction of two values or goals, believing truly and avoiding falsehoods) and other derivable normative epistemic standards such as justification and rationality. Versions of this view have been defended by many well-known epistemologists, including Chisholm, Goldman, Lehrer, Plantinga, Alston, Foley, and Sosa (see Alston, 2005, ch. 1, for an overview). Accounts of this relation differ depending on the notions of justification and rationality adopted and on how the relation between the truth-aim and the derived normative properties is conceived (for example, in consequentialist, deontological, or virtue-based terms).

Another approach in epistemology concerned with the topic of the aim or norm of belief is the so-called Knowledge First approach introduced in §3. According to this view, knowledge has a prominent role among epistemic notions and constitutes the fundamental epistemic standard of assertion, belief, action, practical reasoning, and disagreement (compare “Knowledge Norms”). This approach has generated a comparative study of these standards (for example, Smithies, 2012). An example is the reformulation of various arguments for the knowledge norm of assertion in order to defend other knowledge norms, such as those of belief and action (compare §3). From this perspective, the debate on the aim of belief can help in understanding important aspects of the epistemic norms of assertion, action, practical reasoning, and disagreement, and can in turn receive important contributions from advances in the debates about the norms governing these other practices.

5. References and Further Reading

  • Adler, J. (2002). Belief’s own ethics. Cambridge, MA: MIT Press.
  • Alston, W. (2005). Beyond “justification”: Dimensions of epistemic evaluation. Ithaca, NY: Cornell University Press.
  • Armstrong, D. M. (1973). Belief, truth and knowledge. London: Cambridge University Press.
  • Baldwin, T. (2007). The normative character of belief. In M. S. Green & J. N. Williams (Eds.), Moore’s paradox: New essays on belief, rationality, and the first person (pp. 76–89). Oxford: Oxford University Press.
  • Berker, S. (2013). The rejection of epistemic consequentialism. Philosophical Issues, 23(1), 363–387.
  • Bird, A. (2007). Justified judging. Philosophy and Phenomenological Research, 74(1), 81–110.
  • Blackburn, S. (1993). Essays in quasi-realism. New York, NY: Oxford University Press.
  • Boghossian, P. A. (2003). The normativity of content. Philosophical Issues, 13(1), 31–45.
  • Brandom, R. B. (1994). Making it explicit: Reasoning, representing, and discursive commitment. Cambridge, MA: Harvard University Press.
  • Brandom, R. B. (2001). Modality, normativity, and intentionality. Philosophy and Phenomenological Research, 63(3), 611–623.
  • Brown, J. (2008). The knowledge norm for assertion. Philosophical Issues, 18(1), 89–103.
  • Brown, J. (2012). Assertion and practical reasoning: Common or divergent epistemic standards? Philosophy and Phenomenological Research, 84(1), 123–157.
  • Burge, T. (2003). Perceptual entitlement. Philosophy and Phenomenological Research, 67(3), 503–548.
  • Bykvist, K., & Hattiangadi, A. (2007). Does thought imply ought? Analysis, 67(296), 277–285.
  • Bykvist, K., & Hattiangadi, A. (2013). Belief, truth, and blindspots. In T. Chan (Ed.), The aim of belief (pp. 100–122). New York, NY: Oxford University Press.
  • Chrisman, M. (2008). Ought to believe. Journal of Philosophy, 105(7), 346–370.
  • David, M. (2005). Truth as the primary epistemic goal: A working hypothesis. In M. Steup & E. Sosa (Eds.), Contemporary debates in epistemology (pp. 296–312). Oxford: Blackwell.
  • Davidson, D. (2001). Comments on Karlovy Vary papers. In P. Kotatko (Ed.), Interpreting Davidson (pp. 285–308). Stanford, CA: CSLI Publications.
  • Douven, I. (2006). Assertion, knowledge, and rational credibility. Philosophical Review, 115(4), 449–485.
  • Dretske, F. (2000). Norms, history and the constitution of the mental. In Perception, knowledge and belief (pp. 242–258). Cambridge: Cambridge University Press.
  • Engel, P. (2004). Truth and the aim of belief. In D. Gillies (Ed.), Laws and models in science (pp. 77–97). London: King’s College Publications.
  • Engel, P. (2007). Belief and normativity. Disputatio, 2(23), 179–203.
  • Engel, P. (2013). Doxastic correctness. Aristotelian Society Supplementary Volume, 87(1), 199–216.
  • Enoch, D. (2006). Agency, shmagency: Why normativity won’t come from what is constitutive of action. Philosophical Review, 115(2), 169–198.
  • Fassio, D. (2011). Belief, correctness and normativity. Logique et Analyse, 54(216), 471–486.
  • Feldman, R. (2002). Epistemological duties. In P. Moser (Ed.), The Oxford handbook of epistemology (pp. 362–384). Oxford: Oxford University Press.
  • Fine, K. (1994). Essence and modality. Philosophical Perspectives, 8, 1–16.
  • Firth, R. (1981). Epistemic merit, intrinsic and instrumental. Proceedings and Addresses of the American Philosophical Association, 55(1), 5–23.
  • Frankish, K. (2007). Deciding to believe again. Mind, 116(463), 523–547.
  • Frost, K. (2014). On the very idea of direction of fit. Philosophical Review, 123(4), 429–484.
  • Gibbard, A. (2005). Truth and correct belief. Philosophical Issues, 15(1), 338–350.
  • Gibbons, J. (2013). The norm of belief. Oxford: Oxford University Press.
  • Glüer, K., & Pagin, P. (1998). Rules of meaning and practical reasoning. Synthese, 117(2), 207–227.
  • Glüer, K., & Wikforss, Å. (2009). Against content normativity. Mind, 118(469), 31–70.
  • Glüer, K., & Wikforss, Å. (2010a). The normativity of meaning and content. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Glüer, K., & Wikforss, Å. (2010b). The truth norm and guidance: A reply to Steglich-Petersen. Mind, 119(475), 757–761.
  • Glüer, K., & Wikforss, Å. (2013). Against belief normativity. In T. Chan (Ed.), The aim of belief. New York, NY: Oxford University Press.
  • Griffiths, A. P. (1962). On belief. Proceedings of the Aristotelian Society, 63, 167–186.
  • Hieronymi, P. (2006). Controlling attitudes. Pacific Philosophical Quarterly, 87(1), 45–74.
  • Horwich, P. (2006). The value of truth. Noûs, 40(2), 347–360.
  • Horwich, P. (2013). Belief-truth norms. In T. Chan (Ed.), The aim of belief (pp. 17–31). New York, NY: Oxford University Press.
  • Huemer, M. (2007). Moore’s paradox and the norm of belief. In S. Nuccetelli & G. Seay (Eds.), Themes from G.E. Moore (pp. 142–57). Oxford: Oxford University Press.
  • Humberstone, I. L. (1992). Direction of fit. Mind, 101(401), 59–83.
  • Hume, D. (1739). A treatise of human nature. Oxford: Oxford University Press.
  • Kelly, T. (2003). Epistemic rationality as instrumental rationality: A critique. Philosophy and Phenomenological Research, 66(3), 612–640.
  • Kornblith, H. (1993). Epistemic normativity. Synthese, 94(3), 357–376.
  • Korsgaard, C. M. (1996). The sources of normativity. Cambridge: Cambridge University Press.
  • Kvanvig, J. L. (2003). The value of knowledge and the pursuit of understanding. Cambridge: Cambridge University Press.
  • Lackey, J. (2007). Norms of assertion. Noûs, 41(4), 594–626.
  • Littlejohn, C. (2010). Moore’s paradox and epistemic norms. Australasian Journal of Philosophy, 88(1), 79–100.
  • Littlejohn, C. (2012). Justification and the truth-connection. Cambridge: Cambridge University Press.
  • Littlejohn, C. (2013). The Russellian retreat. Proceedings of the Aristotelian Society, 113, 293–320.
  • Lynch, M. (2004). True to life: Why truth matters. Cambridge, MA: MIT Press.
  • Lynch, M. (2009a). The value of truth and the truth of values. In A. Haddock, A. Millar, & D. Pritchard (Eds.), Epistemic value. Oxford: Oxford University Press.
  • Lynch, M. (2009b). Truth, value and epistemic expressivism. Philosophy and Phenomenological Research, 79(1), 76–97.
  • Maitzen, S. (1995). Our errant epistemic aim. Philosophy and Phenomenological Research, 55(4), 869–876.
  • Mayo, B. (1963). Belief and constraint. Proceedings of the Aristotelian Society, 64, 139–156.
  • McGlynn, A. (2013). Believing things unknown. Noûs, 47(2), 385–407.
  • McHugh, C. (2011). What do we aim at when we believe? Dialectica, 65(3), 369–392.
  • McHugh, C. (2012a). Belief and aims. Philosophical Studies, 160(3), 425–439.
  • McHugh, C. (2012b). The truth norm of belief. Pacific Philosophical Quarterly, 93(1), 8–30.
  • McHugh, C. (2013). Normativism and doxastic deliberation. Analytic Philosophy, 54(4), 447–465.
  • McHugh, C. (2014). Fitting belief. Proceedings of the Aristotelian Society, 114(2), 167–187.
  • McHugh, C., & Whiting, D. (2014). The normativity of belief. Analysis, 74(4), 698–713.
  • Millar, A. (2004). Understanding people: Normativity and rationalizing explanation. New York, NY: Oxford University Press.
  • Millar, A. (2009). How reasons for action differ from reasons for belief. In S. Robertson (Ed.), Spheres of reason (pp. 140–163). New York, NY: Oxford University Press.
  • Miller, A. (2008). Thoughts, oughts and the conceptual primacy of belief. Analysis, 68(299), 234–238.
  • Millikan, R. G. (1984). Language, thought and other biological categories. Cambridge, MA: MIT Press.
  • Moore, G. E. (1942). A reply to my critics. In P. A. Schilpp (Ed.), The philosophy of G. E. Moore. Chicago, IL: Open Court.
  • Moran, R. A. (1997). Self-knowledge: Discovery, resolution, and undoing. European Journal of Philosophy, 5(2), 141–61.
  • Owens, D. J. (2003). Does belief have an aim? Philosophical Studies, 115(3), 283–305.
  • Papineau, D. (1999). Normativity and judgment. Proceedings of the Aristotelian Society, 73, 16–43.
  • Papineau, D. (2013). There are no norms of belief. In T. Chan (Ed.), The aim of belief (pp. 64–79). New York, NY: Oxford University Press.
  • Peacocke, C. (1999). Being known. Oxford: Oxford University Press.
  • Plantinga, A. (1993). Warrant and proper function. New York, NY: Oxford University Press.
  • Platts, M. B. (1979). Ways of meaning: An introduction to a philosophy of language. London: Routledge & K. Paul.
  • Railton, P. (1994). Truth, reason, and the regulation of belief. Philosophical Issues, 5, 71–93.
  • Railton, P. (1997). On the hypothetical and non-hypothetical in reasoning about belief and action. In G. Cullity & B. N. Gaut (Eds.), Ethics and practical reason (pp. 53–79). New York, NY: Oxford University Press.
  • Raleigh, T. (2013). Belief norms and blindspots. Southern Journal of Philosophy, 51(2), 243–269.
  • Ramsey, F. P. (1931). Foundations of mathematics and other logical essays. London: Routledge.
  • Rosen, G. (2001). Brandom on modality, normativity and intentionality. Philosophy and Phenomenological Research, 63(3), 611–623.
  • Setiya, K. (2008). Believing at will. Midwest Studies in Philosophy, 32(1), 36–52.
  • Shah, N. (2003). How truth governs belief. Philosophical Review, 112(4), 447–482.
  • Shah, N. (2008). How action governs intention. Philosophers’ Imprint, 8(5), 1–19.
  • Shah, N., & Velleman, J. D. (2005). Doxastic deliberation. Philosophical Review, 114(4), 497–534.
  • Smithies, D. (2012). The normative role of knowledge. Noûs, 46(2), 265–288.
  • Sosa, E. (2003). The place of truth in epistemology. In L. Zagzebski & M. DePaul (Eds.), Intellectual virtue: Perspectives from ethics and epistemology (pp. 155–180). New York, NY: Oxford University Press.
  • Sosa, E. (2007). A virtue epistemology: Apt belief and reflective knowledge, Volume I. Oxford: Oxford University Press.
  • Sosa, E. (2009). Knowing full well: The normativity of beliefs as performances. Philosophical Studies, 142(1), 5–15.
  • Sosa, E. (2010). Knowing full well. Princeton, NJ: Princeton University Press.
  • Stalnaker, R. (1984). Inquiry. Cambridge, MA: MIT Press.
  • Steglich-Petersen, A. (2006). No norm needed: On the aim of belief. Philosophical Quarterly, 56(225), 499–516.
  • Steglich-Petersen, A. (2008). Against essential normativity of the mental. Philosophical Studies, 140(2), 263–283.
  • Steglich-Petersen, A. (2009). Weighing the aim of belief. Philosophical Studies, 145(3), 395–405.
  • Sutton, J. (2007). Without justification. Cambridge, MA: MIT Press.
  • Toribio, J. (2013). Is there an “ought” in belief? Teorema: Revista Internacional de Filosofía, 32(3), 75–90.
  • Vahid, H. (2006). Aiming at truth: Doxastic vs. epistemic goals. Philosophical Studies, 131(2), 303–335.
  • Vahid, H. (2009). The epistemology of belief. London: Palgrave Macmillan.
  • Velleman, D. (2000a). On the aim of belief. In D. Velleman (Ed.), The possibility of practical reason (pp. 244–281). New York, NY: Oxford University Press.
  • Velleman, D. (2000b). The possibility of practical reason (Vol. 106). New York, NY: Oxford University Press.
  • Wedgwood, R. (2002). The aim of belief. Philosophical Perspectives, 16(s16), 267–97.
  • Wedgwood, R. (2007). The nature of normativity. New York, NY: Oxford University Press.
  • Wedgwood, R. (2013). Doxastic correctness. Aristotelian Society Supplementary Volume, 87(1), 217–234.
  • Weiner, M. (2005). Must we know what we say? Philosophical Review, 114(2), 227–251.
  • Weiner, M. (2014). The spectra of epistemic norms. In J. Turri & C. Littlejohn (Eds.), Epistemic norms: New essays on action, belief, and assertion (pp. 201–218). Oxford: Oxford University Press.
  • Whiting, D. (2010). Should I believe the truth? Dialectica, 64(2), 213–224.
  • Whiting, D. (2013). Nothing but the truth: On the norms and aims of belief. In T. Chan (Ed.), The Aim of Belief. New York, NY: Oxford University Press.
  • Williams, B. (1973). Deciding to believe. In B. Williams (Ed.), Problems of the Self (pp. 136–51). Cambridge, MA: Cambridge University Press.
  • Williams, B. (2002). Truth and truthfulness: An essay in genealogy. Princeton, NJ: Princeton University Press.
  • Williamson, T. (2000). Knowledge and its limits. Oxford: Oxford University Press.
  • Williamson, T. (2005). Knowledge, context, and the agent’s point of view. In G. Preyer & G. Peter (Eds.), Contextualism in philosophy: Knowledge, meaning, and truth (pp. 91–114). New York, NY: Oxford University Press.
  • Wright, S. (2014). The dual-aspect norms of belief and assertion: A virtue approach to epistemic norms. In C. Littlejohn & J. Turri (Eds.), Epistemic norms: New essays on action, belief, and assertion (pp. 239–258). New York, NY: Oxford University Press.
  • Yamada, M. (2012). Taking aim at the truth. Philosophical Studies, 157(1), 47–59.
  • Zagzebski, L. (2004). Epistemic value and the primacy of what we care about. Philosophical Papers, 33(3), 353–377.
  • Zalabardo, J. L. (2010). Why believe the truth? Shah and Velleman on the aim of belief. Philosophical Explorations, 13(1), 1–21.
  • Zangwill, N. (2005). The normativity of the mental. Philosophical Explorations, 8(1), 1–19.

 

Author Information

Davide Fassio
Email: davide.fassio@unige.ch
University of Geneva
Switzerland

Ernst Cassirer (1874-1945)

Ernst Cassirer was the most prominent, and the last, Neo-Kantian philosopher of the twentieth century. His major philosophical contribution was the transformation of his teacher Hermann Cohen’s mathematical-logical adaptation of Kant’s transcendental idealism into a comprehensive philosophy of symbolic forms intended to address all aspects of human cultural life and creativity. In doing so, Cassirer paid equal attention to both sides of the traditional Neo-Kantian division between the Geisteswissenschaften and Naturwissenschaften, that is, between the social sciences and the natural sciences. This is expressed most systematically in his masterwork, the multi-volume Philosophie der symbolischen Formen (1923-9). Here Cassirer marshaled the widest learning of human cultural expression—in myth, religion, language, philosophy, history, art, and science—for the sake of completing and correcting Kant’s transcendental program. The human being, for Cassirer, is not simply the rational animal, but the animal whose experience with and reaction to the world is governed by symbolic relations. Cassirer was a quintessential humanistic liberal, believing freedom of rational expression to be coextensive with liberation. Cassirer was also the twentieth century’s greatest embodiment of the Enlightenment ideal of comprehensive learning, having written widely-acclaimed histories of the ideas of science, historiography, mathematics, mythology, political theory, and philosophy. Though he was cordial with both Moritz Schlick and Martin Heidegger, Cassirer’s popularity was eclipsed by the simultaneous rise of logical positivism in the English-speaking world and of phenomenology on the European continent. His professional career was also a victim of the political events surrounding the ascendancy of Nazism in German academies.

Table of Contents

  1. Biography
  2. Philosophy of Symbolic Forms
  3. Cultural Anthropology
  4. Myth
  5. Language
  6. Science
  7. Political Philosophy
  8. The Davos Conference
  9. References and Further Reading
    1. Cassirer’s Major Works
    2. Further Reading

1. Biography

Ernst Cassirer was born in 1874, the son of the established Jewish merchant Eduard Cassirer, in the former German city of Breslau (modern day Wrocław, Poland). He matriculated at the University of Berlin in 1892. His father intended that he study law, but Cassirer’s interest in literature and philosophy prevented him from doing so. Sampling various courses at the universities at Leipzig, Munich, and Heidelberg, Cassirer was first exposed to the Neo-Kantian philosophy by the social theorist Georg Simmel in Berlin. In 1896, Cassirer began his doctoral studies under Hermann Cohen at the University of Marburg.

Cassirer’s interests at Marburg ran, as they would always, toward framing Neo-Kantian thought in the wider contexts of historical thinking. These interests culminated in his dissertation, Descartes’ Kritik der mathematischen und naturwissenschaftlichen Erkenntnis (1899). Three years later, Cassirer published a similarly historical book on Leibniz’ System in seinen wissenschaftlichen Grundlagen (1902). Cassirer was also the editor of Leibniz’ Philosophische Werke (1906). His focus on the development of modern idealist epistemology and its foundational importance for the history of the various natural sciences and mathematics reached its apex in Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit, whose first three volumes appeared between 1906 and 1920 and for which he was awarded the Kuno Fischer Medal by the Heidelberg Academy. The first volume, Cassirer’s Habilitationsschrift at the University of Berlin (1906), examines the development of epistemology from the Renaissance through Descartes; the second (1907) continues from modern empiricism through Kant; the third (1920) deals with the development of epistemology after Kant, especially the division between Hegelians and Neo-Kantians; and the fourth volume, on contemporary epistemology and science, was written in exile in 1940 but only published after the end of the war in 1946.

Although his quality as a scholar of ideas was unquestioned, anti-Jewish sentiment in German universities made finding suitable employment difficult for Cassirer. Only through the personal intervention of Wilhelm Dilthey was Cassirer given a Privatdozent position at the University of Berlin in 1906. His writing there was prolific and continued the Neo-Kantian preoccupation with the intersections among epistemology, mathematics, and natural science. Cassirer’s work on, and with, Einstein exemplifies the quality of his contributions to the philosophy of science: Substanzbegriff und Funktionsbegriff (1910) and Zur Einsteinschen Relativitätstheorie: Erkenntnistheoretische Betrachtungen (1921). These works also mark Cassirer’s conviction that an historian of ideas could make a major contribution to the most contemporary problems in every field.

After the First World War, and in the more tolerant Weimar Republic, Cassirer was invited to a chair at the new University of Hamburg in 1919. There, Cassirer came into the cultural circle of Erwin Panofsky and the Warburg Library of the Cultural Sciences. Immediately Cassirer was absorbed into the vast cultural-anthropological data collected by the Library, effecting the widest expansion of Neo-Kantian ideas into the previously uncharted philosophical territories of myth, the evolution of language, zoology, primitive cultures, fine art, and music. The acquaintance with the Warburg circle transformed Cassirer from a student of the Marburg School’s analysis of the transcendental conditions of thinking into a philosopher of culture whose inquisitiveness touched nearly all areas of human cultural life. This intersection of Marburg and Warburg was indeed the necessary background of Cassirer’s masterwork, the three-volume Philosophie der symbolischen Formen (1923-1929).

In addition to his programmatic work, Cassirer was a major contributor to the history of ideas and the history of science. In conscious contrast with Hegelian accounts of history, Cassirer does not begin with the assumption of a theory of dialectical progress that would imply the inferiority of earlier stages of historical development. By starting instead with the authors, cultural products, and historical events themselves, Cassirer finds characteristic frames of mind defined by the kinds of philosophical questions and responses they pose, which are in turn constituted by characteristic forms of rationality. Among his works at this time, which influenced a generation of historians of ideas from Arthur Lovejoy to Peter Gay, are Individuum und Kosmos in der Philosophie der Renaissance (1927); Die Platonische Renaissance in England und die Schule von Cambridge (1932); Die Philosophie der Aufklärung (1932); Das Problem Jean-Jacques Rousseau (1932); and Descartes: Lehre, Persönlichkeit, Wirkung (1939). Cassirer’s philosophy of science had a similar influence on the historical analyses of Alexander Koyré and, through him, Thomas Kuhn.

In 1929, Cassirer was chosen Rektor of the University of Hamburg, making him the first Jewish person to hold that position in Germany. However, even as Cassirer’s star was rising, the situation for Jewish academics was deteriorating. With Hitler’s appointment as Chancellor came the ban on Jews holding academic positions. Cassirer saw the writing on the wall and emigrated with his family in 1933. He spent two years at Oxford and then six at Göteborg, where he wrote Determinismus und Indeterminismus in der modernen Physik (1936), Descartes: Lehre, Persönlichkeit, Wirkung (1939), and Zur Logik der Kulturwissenschaften (1942). While in Sweden, he also wrote the first comprehensive study of the Swedish legal theorist and proto-Analytic philosopher Axel Hägerström.

In 1941, Cassirer boarded the last ship the Germans permitted to sail from Sweden to the United States, where he would hold positions at Yale for two years and then at Columbia for one. His final books, written in English, were the career-synopsis, An Essay on Man (1944), and his first philosophical foray into contemporary politics, The Myth of the State (1946), published posthumously. Cassirer’s death in New York City on April 13, 1945, preceded that of Hitler and the surrender of Germany by mere weeks.

2. Philosophy of Symbolic Forms

“The Philosophy of Symbolic Forms is not concerned exclusively or even primarily with the purely scientific, exact conceiving of the world; it is concerned with all the forms assumed by man’s understanding of the world” (Philosophy of Symbolic Forms, vol. III, 13). For Cassirer, Neo-Kantianism was less about doctrinal allegiance than it was about a common commitment to explore the cognitive structures that underlie the variety of human experience. After the death of Cohen, Cassirer became increasingly interested in value and culture. Inspired by the Warburg Library, Cassirer cast his net into an ocean of cultural expression, trying to find the common thread that united the manifold of cultural forms, that is, to move from the critique of reason to the critique of culture.

As to what precisely symbolic forms are, Cassirer offers perhaps his clearest definition in an early lecture at the Warburg Library (1921):

By ‘symbolic form’ I mean that energy of the spirit through which a mental meaning-content is attached to a sensual sign and inwardly dedicated to this sign. In this sense language, the mythical-religious world, and the arts each present us with a particular symbolic form. For in them all we see the mark of the basic phenomenon, that our consciousness is not satisfied to simply receive impressions from the outside, but rather that it permeates each impression with a free activity of expression. In what we call the objective reality of things we are thus confronted with a world of self-created signs and images. (“Der Begriff der Symbolischen Form im Aufbau der Geisteswissenschaften”)

An illustration Cassirer uses is that of the curved line on a flat plane. To the geometer, the line means a quantitative relation between the two dimensions of the plane; to the physicist, the line perhaps means a relation of energy to mass; and to the artist, the line means a relation between light and darkness, shape and contour. More than simply a reflection of different practical interests, Cassirer believes each of these brings different mental energies to bear in turning the visual sensation of the line into a distinct human experience. No one of these ways of experiencing is the true one, though they each have their distinctive pragmatic uses within their individual fields. The task of the philosopher is to understand the internal directedness of each of these mental energies independently and in relation to the others as the sum total of human mental expression, which is to say, culture.

The first two forms Cassirer discusses, in the first two volumes respectively, are language and myth. The third volume of the Philosophy of Symbolic Forms concerns contemporary advances in epistemology and natural science: “We shall show how the stratum of conceptual, discursive knowledge is grounded in those other strata of spiritual life which our analysis of language and myth has laid bare; and with constant reference to this substructure we shall attempt to determine the particularity, organization, and architectonics of the superstructure – that is, of science” (Philosophy of Symbolic Forms, vol. III, xiii). Cassirer works historically, tracing the problem of philosophical knowledge through the Ancient Greeks up through the Neo-Kantian tradition. The seemingly endless battle between intuition and conceptualization has been contended in various forms between the originators of myths and the earliest theorists of number, between the Milesians and Eleatics, between the empiricists and rationalists, and again right up to Ernst Mach and Max Planck. Cassirer’s position here is conciliatory: both sides have contributed, and will continue to contribute, their perspectives on the eternal questions of philosophy insofar as both recognize their efforts as springing from the human being’s multifaceted and spontaneous creativity—as symbol-forming rather than merely designating endeavors that, in their dialectic with one another, construct ever more elaborate and yet universal ways to navigate our world:

Physics gains this unity and extension by advancing toward ever more universal symbols. But in this process it cannot jump over its own shadow. It can and must strive to replace particular concepts and signs with absolutely universal ones. But it can never dispense with the function of concepts and signs as such: this would demand an intellectual representation of the world without the basic instruments of representation. (Philosophy of Symbolic Forms, vol. III, 479)

The fourth volume, The Metaphysics of Symbolic Forms, was published posthumously. Along with other papers left at the time of his death, the German original is now found in the first volume of Cassirer’s Nachgelassene Manuskripte und Texte, edited by John Michael Krois and Oswald Schwemmer in 1995. The English volume, assembled and edited by Donald Philip Verene and John Michael Krois in 1996, contains two texts from different periods in Cassirer’s writings. The first, from 1928, deals with human nature rather than metaphysics proper. In agreement with Heidegger, curiously, Cassirer seeks to replace traditional metaphysics with a fundamental study of human nature. Much of the thematic discussion of this part receives a refined and more complete expression in Cassirer’s 1944 Essay on Man. What is of novel interest here concerns his discussion of then-contemporary philosophical anthropologists like Dilthey, Bergson, and Simmel and also the Lebensphilosophen Schopenhauer, Kierkegaard, and Nietzsche, who otherwise receive short shrift in his work. His critical remarks about these latter thinkers involve their treatment of life as a new sort of metaphysics, one marred, however, by the sorts of dogmatism characteristic of pre-Kantian metaphysics.

The second text in Verene and Krois’s assembled volume comes from 1940, well after the project had been otherwise finished, and its theme is what Cassirer terms “basis phenomena”: phenomena so fundamental that they cannot be derived from anything else. The main basis phenomenon concerns how the tripartite structure of the self’s personal relation to the environment is mirrored in a tripartite social structure of the “I,” the “you,” and that which binds society: “work.” Work here is not to be confused with the Marxist conception: for Cassirer, work is anything made or effected, any subjective operation on the objective world. The initial and most fundamental production of work, for Cassirer, is culture—the sphere in which the “I” and “you” come together in active life.

Several objections to Cassirer’s masterwork have been raised. First, the precise identity and number of forms is ambiguous over Cassirer’s corpus. In the lecture from 1921, Cassirer names language, myth-religion, and art as forms, but that number cannot be considered exhaustive. Even in his summatory Essay on Man, consecutive pages maintain different lists: “myth, language, art, religion, history, science” (222) and then “language, myth, art, religion, science” (223); elsewhere science is omitted (63); mathematics is sometimes added; and religion is sometimes considered part of mythic thinking. The first two of the four volumes of The Philosophy of Symbolic Forms—on language and myth respectively—would seem to indicate that each volume would treat a specific form. But the latter two volumes break the trend to deal with a host of different forms. Moreover, it is ambiguous how precisely those forms are related. For example, myth is sometimes treated as a primitive form of language and sometimes non-developmentally as an equal correlate. Arithmetic and geometry are the logic that undergirds the scientific symbolic form, but in no way do they undergird primitive forms of science that have been superseded. Whether the forms are themselves developmental or whether development takes place by the instantiation of a new form is also left vague. For example, Cassirer indicates that the move from Euclidean to non-Euclidean geometry involves not just progress but an entirely new system of symbolization. However, myth does not seem to develop itself into anything else other than into something wholly different, that is, representational language.

There is, however, a certain necessity to Cassirer’s imprecision on these points. Taken together, the Philosophy of Symbolic Forms is a grand narrative that exposits how various human experiences evolve out of an originally animalistic and primitive articulation of expressive signs into the complicated and more abstract forms of culture in the twenty-first century. As “energies of the spirit” they cannot be affixed with the kind of rigid architectonic featured in Kant’s transcendental deduction of purely logical forms. Though spontaneous acts of mental energy, symbolic forms are both developmental and pragmatic insofar as they adapt over time to changing environments in response to real human needs, something that resists an overly rigid structuralism. Those responses feature a loose sort of internal-logic, but one characterized according to contingent cultural interactions with the world. Therefore, one ought not to expect Cassirer to offer the same logical precision that comes with the typical Neo-Kantian discernment of mental forms insofar as logic is only one form among many cultural relations with life.

3. Cultural Anthropology

Cassirer’s late Essay on Man (1944) expresses neatly his lifelong attempt to combine his Neo-Kantian view of the actively-constituting subject with his Warburgian appreciation for the diversity of human culture. Here, as ever, Cassirer begins with the history of views up into his present time, culminating in the presentation of a definitive scientific thesis that he would then proceed to refute. Jakob von Uexküll’s Umwelt und Innenwelt der Tiere (1909) argued that evolutionary biology has taken too far the view that animal parts and functions develop as a response to environmental factors. In its place Uexküll offers the “functional circle” of animal activity, which identifies the interaction of distinct receptor and effector systems. Animals are not simply reacting to the environment as it presents itself in sensory stimuli. They adapt themselves, consciously and unconsciously, to their environments, sometimes with clear signs of intelligence and insight. Different animals use diverse and sometimes highly complex systems of signals to better respond to and manipulate their environments to their advantage. Dogs, for example, are adroit at reading signals in body language, vocal tones, and even hormone changes while being remarkably effective in expressing a complex range of immediate inner states in terms of the vocalized pitch of their whimpers, grunts, or barks, as well as the bends of their tails, or the posture of their spines. In Pavlov’s famous experiments, dogs were conditioned to react both to the immediate signals of meat—its visual appearance and smell—and also to mediate signals, like a ringing bell, to the same effect.

Cassirer thinks this theory makes good sense of the animal world as a corrective to a too-simple version of evolution, but doubts this can be applied to humans. Over and above the signals received and expressed by animals, human beings evolved to use symbols to make their world meaningful. The same ringing of the bell would not be considered by man a physical signal so much as a symbol whose meaning transcends its real, concrete stimulation. For man, a bell does not indicate simply that food is coming, but induces him to wonder why that bell might indicate food, or whether an exam is over, or whether a sacrament has been fulfilled, or whether someone is on the telephone. None of those symbols would lead necessarily to a response in the way the conditioned dog salivates at the bell. They instead prompt a range of freely creative responses in human knowers within distinct spheres of meaning:

Symbols—in the proper sense of this term—cannot be reduced to mere signals. Signals and symbols belong to two different universes of discourse: a signal is a part of the physical world of being; a symbol is a part of the human world of meaning. Signals are ‘operators’; symbols are ‘designators’. Signals, even when understood and used as such, have nevertheless a sort of physical or substantial being; symbols have only a functional value. (Essay on Man 32)

Between the straightforward reception of physical stimuli and the expression of an inner world lies, for Cassirer, the symbolic system: “This new acquisition transforms the whole of human life. As compared with the other animals man lives not merely in a broader reality; he lives, so to speak, in a new dimension of reality” (Essay on Man 24). That dimension is distinctively Kantian: the a priori forms of space and time. Animals have little trouble working in three-dimensional space; their optical, tactile, acoustic, and kinesthetic apprehension of spatial distances functions at least as well as it does in humans. But only to the human is the symbol of pure geometrical space meaningful, a universal, non-perceptual, theoretical space that persists without immediate relationship to his or her own interaction with the world: “Geometrical space abstracts from all the variety and heterogeneity imposed upon us by the disparate nature of our senses. Here we have a homogenous, a universal space” (Essay on Man 45). In terms of time, too, there can be no doubt that higher animals remember past sensations, or that memory affects the manner in which they respond when similar sensations are presented. But in the human person the past is not simply repeated in the present, but transformed creatively and constructively in ways that reflect values, regrets, hopes, and so forth:

It is not enough to pick up isolated data of our past experience; we must really re-collect them, we must organize and synthesize them, and assemble them into a focus of thought. It is this kind of recollection which gives us the characteristic human shape of memory, and distinguishes it from all the other phenomena in animal or organic life. (Essay on Man 51)

As animals recall pasts and live within sensory space, human beings construct histories and geometries. Both history and geometry, then, are symbolic engagements that render the world meaningful in an irreducibly human fashion.

This symbolic dimension of the person carries him or her above the effector-receptor world of environmental facts and subjective responses. He or she lives instead in a world of possibilities, imaginations, fantasy, and dreams. However, just as there is a kind of logic to the language of contrary-to-fact conditionals or to the rules of poetic rhythm, so too is there a natural directedness expressed in how human beings construct a world of meaning out of those raw effections and receptions. That directedness cannot, however, be restricted to rational intentionality, though reason is indeed an essential component. In distinction from the Neo-Kantian theories of experience and representation, Cassirer thinks there is a wider network of forms that enable a far richer engagement between subject and object than reason could produce: “Hence, instead of defining man as an animal rationale, we should define him as an animal symbolicum” (Essay on Man 26).

With his definition of man as the symbolic animal, Cassirer is in position to reenvision the task of philosophy. Philosophy is much more than the analysis and eventual resolution of a set of linguistic problems, as Wittgenstein would have it, nor is it restricted, as it was for many Neo-Kantians, to transcendentally deducing the logical forms that would ground the natural sciences. Philosophy’s “starting point and its working hypothesis are embodied in the conviction that the varied and seemingly dispersed rays may be gathered together and brought into a common focus” (Essay on Man 222). The functions of the human person are not merely aggregate, loosely-connected expressions and factual conditions. Philosophy seeks to understand the connections that unite those expressions and conditions as an organic whole.

4. Myth

Max Müller was the leading theorist of myth in Cassirer’s day. In the face of Anglophone linguistic analysis, Müller held myth to be the necessary means by which earlier peoples communicated, one which left a number of traces within more-developed contemporary languages. What is needed for the proper study of myth, beyond this appreciation of its utility, is a step by step un-riddling of the mythical objects in non-mythical concepts so as to rationally articulate what a myth really means. Sigmund Freud, of course, also considered myth to be a sort of unconscious expression, one that stands as a primitive version of the naturally-occurring expression of subconscious drives.

Cassirer considers myth in terms of the Neo-Kantian reflex by first examining the conditions for thinking and then analyzing the objects which are thought. In his Sprache und Mythos (1925), which is a sort of condensed summary of the first two volumes of Philosophy of Symbolic Forms, Cassirer comes to criticize Müller, more so than Freud, for an unreflective realism about the objects of myth. To say that objects of any sort are what they are independent of their representation is to misunderstand the last century of transcendental epistemology. Accordingly, to treat myth as a false representation of those objects, one waiting to be “corrected” by a properly rational representation, is to ignore the wider range of human intellectual power. Naturalizing myths, as Müller and his followers sought to do, does not dissolve an object’s mythical mask so much as transplant it into the foreign soil of an alternative symbolic form:

From this point of view all artistic creation becomes a mere imitation, which must always fall short of the original. Not only simple imitation of a sensibly presented model, but also what is known as idealization, manner, or style, must finally succumb to this verdict; for measured by the naked ‘truth’ of the object to be depicted, idealization is nothing but subjective misconception and falsification. And it seems that all other processes of mental gestation involve the same sort of outrageous distortion, the same departure from objective reality and the immediate data of experience. (Language and Myth, trans. Langer [1946], 6)

Müller’s view of myth is a symptom of a wider problem. For if myth is akin to art or language in falsifying the world as it really is, then language is limited to merely expressing itself without any claim to truth either: “From this point it is but a single step to the conclusion which the modern skeptical critics of language have drawn: the complete dissolution of any alleged truth content of language, and the realization that this content is nothing but a sort of phantasmagoria of the spirit” (Language and Myth, trans. Langer [1946], 7). Cassirer rejects such fictionalism in myth and language both as an appeal to psychologistic measures of truth that fail to see a better alternative in the philosophy of symbolic forms.

For Cassirer, myth (and language, discussed below) does reflect reality: the reality of the subject. Accordingly, the study of myth must focus on the mental processes that create myth instead of the presupposed ‘real’ objects of myth:

Instead of measuring the content, meaning, and truth of intellectual forms by something extraneous which is supposed to be reproduced in them, we must find in these forms themselves the measure and criterion for their truth and intrinsic meaning. Instead of taking them as mere copies of something else, we must see in each of these spiritual forms a spontaneous law of generation; an original way and tendency of expression which is more than a mere record of something initially given in fixed categories of real existence. (Language and Myth, trans. Langer [1946], 8)

The mythic symbol creates its own “world” of meaning distinct from that created by language, mathematics, or science. The question is no longer whether mythic symbols, or any of the other symbolic forms, correspond to a reality distinct from their mode of representation, but rather how myths relate to those other forms as limitations and supplementations of them. No matter how heterogeneous and variegated are the myths that come down to us, they move along definite avenues of feeling and creative thought.

An example Cassirer uses to illustrate his understanding of myth-making is the Avesta myth of Mithra. Attempts to identify Mithra as the sun-god, and thereby analogize it to the sun-god of the Egyptians, Greeks, and other early peoples, are misguided insofar as they stem from the attempt to explain away the object of mythical thinking in naturalistic rational terms. Cassirer points out that the analogy does not hold for strictly interpretive reasons: Mithra is said to appear on mountain tops before dawn and to illuminate the earth at night as well, and so cannot be the mythical analog of the sun. Mithra is not a thing to be naturalized, but evidence of an alternate spiritual energy that fashions symbolic responses to experiential confusions. What Mithra specifically reflects is a mode of thinking as it struggles to make sense of how the qualities of light and darkness result from a single essential unity: the cosmos.

As historical epochs provide new and self-enclosed worlds of experience, so too does myth evolve in conjunction with the needs of the age as an expression of overlapping but quite distinct patterns of mental life. Myths are hardly just wild stories with a particular pragmatic lesson. There is a specific mode of perception that imbues mythic thinking with its power to transcend experience. Similar to Giambattista Vico’s vision of historical epochs, Cassirer views the development of culture out of myth as a narrative of progressively more abstract systems of representation that serve as the foundation for human culture. Like Vico, too, Cassirer sees continuity between the most elevated systems of theoretical expression of the modern day—namely, religion, philosophy, and above all natural science—and the more primitive mind’s reliance upon myth and magic. However, Cassirer shares more with Enlightenment optimism than with Vico’s pessimistic conviction about the progressive degeneracy of scientific abstraction.

5. Language

The first volume of The Philosophy of Symbolic Forms (1923), on language, is guided by the search for epistemological reasons sufficient to explain the origin and development of human speech. Language is neither a nominal nor arbitrary designation of objects, nor, however, does language hold any immediate or essential connection to the object of its designation. The use of a word to designate an object is already caught in a web of intersubjectively-determined meanings which of themselves contain much more than the simple reference. Words are meaningful within experience, and that experience stands, as it did for Kant, as a sort of middle ground between the pure reception of objects and the autonomous activity of reason to generate forms within which content could be meaningful. In contrast to Kant and the Neo-Kantians, however, those forms cannot be presumed to be identical among all rational agents over the spans of history. Animal language is essentially a language of emotion, expressions of desires and aversions in response to environmental factors. Similarly, the earliest words uttered by our primitive ancestors were signs to deal with objects, every bit a tool alongside other tools to deal with the primitive’s sensed reality. As the human mind evolved to add spatio-temporal intuitions to mere sensation, a representational function overtook the mind’s merely expressive operations. The primitive vocalized report of received sensations became representations of enduring objects within fixed spatial points: “The difference between propositional language and emotional language is the real landmark between the human and the animal world. All the theories and observations concerning animal language are wide of the mark if they fail to recognize that fundamental difference” (Essay on Man 30). The features of those objects were further abstracted such that from commonalities there emerged a host of types, kinds, and eventually universals, whose meaning allowed for the emergence of mathematics, science, and philosophy.

The animal’s emotive signals operate as a practical imagination in a world of immediate experience. Proper human propositional speech, on the other hand, is already imbued at even its most basic levels with theoretical structures that involve quintessentially spatio-temporal forms linking subjects and their objects: “Language has a new task wherever such relationships are signified linguistically, where ‘here’ is distinguished from ‘there,’ where the location of the speaker is distinguished from the one spoken to, or where the greater nearness or distance is rendered by various indicative particles” (“The Problem of the Symbol and its Place in the System of Philosophy” in Luft [2015], 259). The application of dimensionality, and temporality as well, transforms the subjective sensation into an objective representation. Prepositions, participles, subjunctives, conditionals, and the rest all involve either temporal or spatial prescriptions, and none of them seems to be a feature of animal space. The older animalistic content is not entirely discarded, as the same basic desires and emotions are expressed. The means of that expression, however, are formally of an entirely different character that binds the subject to the object in ways supposed to be binding for other rational agents. Although the interjection “ouch!” expresses pain well enough, and although animals have variously similar yelps and cries, it lacks the representational form of the proposition “I (this one, here and now) am (presently) in pain.” In the uniquely human sphere of ethics, too, the reliance on subjunctive and conditional verbal forms—“I ought not to have done that,” for example—always carries language beyond simple evocations of pleasures and aversions into the symbolic realm of meaningfulness.

The Neo-Kantian position on language allows Cassirer to address two contemporary anomalies in linguistic science. The first is the famous case of Helen Keller, the unfortunate deafblind girl from Alabama, who, with the help of her teacher Anne Sullivan, went on to become a prolific author and social activist. Sullivan had taught Helen signs by using a series of taps on her hand to correspond to particular sense impressions. The problem, Cassirer argued, went beyond Helen’s disabled sensory capacities: she was at first unable to cognize in the characteristically human way. One day at a water pump, Sullivan tapped “water” and Helen recognized the disjunction between the various sensations of water (varying temperatures, viscosities, and degrees of pressure) and the “thing” which is universally referred to as such. That moment opened up for Helen an entire world of names, not as mere expressive signals covering various sensations but as intersubjectively valid objective symbols. This discovery marked her entry into a new, symbolic mode of thinking: “The child had to make a new and much more significant discovery. She had to understand that everything has a name—that the symbolic function is not restricted to particular cases but is a principle of universal applicability which encompasses the whole field of human thought” (Essay on Man 34f).

The second case is the pathology of aphasia. As in the case of Helen Keller, what had long been thought a deficiency of the senses was revealed by Cassirer to be a cognitive failing. In the case of patients with traumatic injuries to certain areas of the brain, particular classes of speech act became impossible. The mechanical operation of producing the words was not the problem, but an inability to speak objectively about “unreal” conditions: “A patient who was suffering from a hemiplegia, from a paralysis of the right hand, could not, for instance, utter the words: ‘I can write with my right hand,’ because this was to him the statement of a fact, not of a hypothetical or unreal case” (Essay on Man 57). These types of aphasiacs were confined to the data provided by their sense impressions and therefore could not make the crucial symbolic move to theoretical possibility. For Cassirer, this was good evidence that language was neither mere emotional expression nor free-floating propositional content that could be analyzed logically only a posteriori.

In addition to these cases of abnormal speech pathology, Cassirer’s attention to the evolution of language enabled him to take a much wider view of both the form of utterance and its content than his more famous counterparts among the linguistic analysts. In Carnap’s Logical Syntax of Language, for example, the attempt is made to reduce semantic rules to syntax. The expected outcome was a philosophical grammar, a sound and complete system of words in the sort of logical relation that would be universally valid. For Cassirer, however, “human speech has to fulfill not only a universal logical task but also a social task which depends on the specific social conditions of the speaking community. Hence we cannot expect a real identity, a one-to-one correspondence between grammatical and logical forms” (Essay on Man 128). Contrary to the early analytical school, language cannot be considered a given thing waiting to be assessed according to independent logical categories, but instead needs to be assessed according to the a priori application of those categories to verbal expressions. Accordingly, the task of the philosopher of language must be refocused to account for the diversity and creativity of linguistic dynamics in order to better encapsulate the human rational agent in the fullest possible range of his or her powers.

6. Science

Cassirer was perhaps the last systematic philosopher to have both exhaustive knowledge of the historical development of each of the individual sciences as well as thorough familiarity with his day’s most important advancements. Substance and Function (1910) could still serve as a primer for the history of major scientific concepts prior to the twentieth century. The first part examines the concepts of number, space, and a vast array of special problems such as Emil du Bois-Reymond’s “limiting concepts”; Robert Mayer’s methodological advancements in thermodynamics; the spatial continuities of atoms in the physics of Roger Boscovich and Gustav Fechner; Galileo’s concept of inertia; Heinrich Hertz’s mechanics; and John Dalton’s law of multiple proportions. Each of these is examined with a view toward the epistemological presuppositions that gave rise to those problems and how each scientist’s innovations represented a novel way of posing problems through an application of spatio-temporal concepts.

This historical survey allows Cassirer to offer his own contributions to these problems along recognizably Neo-Kantian lines in the second part of Substance and Function. Science cannot be considered a collection of empirical facts. Science discovers no absolute qualities but only qualities in relation to other qualities within a particular field, such as the concept of mass as the sum of relations with respect to external impulses in motion, or energy as the momentary condition of a given physical system. Concrete sensuous impressions are only transformed into empirical objects by the determination of spatial and temporal form. The properties of objects, in bringing them into meaningful discourse by means of measurement, are thus mathematized as a field of relations: “The chaos of impressions becomes a system of numbers; but these numbers first gain their denomination, and thus their specific meaning, from the system of concepts which are theoretically established as universal standards of measurement” (Substance and Function 149). Objects as they stand outside possible experience are not the proper subject matter of science, any more than they are for mathematics. Proper science examines the logical connections among the spatio-temporal relationships of objects precisely as they are constituted by experience.

Abandoning the particular sensuous properties of objects for their logical relations as members of a system refocuses the scientific inquiry on how the natural world is symbolized by mathematical logic. Science becomes anthropomorphized insofar as whatever content is available to experience will be content that the human being spontaneously and creatively renders meaningful: “No content of experience can ever appear as something absolutely strange; for even in making it a content of our thought, in setting it in spatial and temporal relations with other contents we have thereby impressed it with the seal of our universal concepts of connection, in particular those of mathematical relations” (Substance and Function 150). However, this in no way reduces science to mere relativism of personal inner projections, as if one way of representing the world were no better than any other. Though we do not know objects independent of mental representation, scientific understanding functions objectively by fixing the permanent logical elements and their connections within a uniform manifold of experience: “The object marks the logical possession of knowledge, and not a dark beyond forever removed from knowledge” (Substance and Function 303). Thus, science is absolutely tied to empirical reality, by which Cassirer means the sum of logical relations through which humans cognize the world. Therefore science, too, as much as language or myth, symbolically constitutes the world in its particular idiom: “The symbol possesses its adequate correlate in the connection according to law, that subsists between the individual members, and not in any constitutive part of the perception; yet it is this connection that gradually reveals itself to be the real kernel of the thought of empirical ‘reality’” (Substance and Function 149).

This Neo-Kantian vision of science is not something Cassirer thinks stands to “correct” science as currently practiced. On the contrary, the great modern scientists themselves have assumed precisely the same view, though in terms lacking the proper philosophical rigor. Newton’s assumption of absolute space and time put science on its first firm foundation, and in doing so he had to relinquish a purely sense-certain view of experience. Space and time in classical physics fix natural processes within a geometric schema, and fix mass as a self-identical thing within infinitely different spaces and different times. What Newton failed to realize was that this vision of space and time imputed ideal forms into what he believed was the straightforward observation of real objects. Kant had already shown as much. James Clerk Maxwell’s theory of light waves breaks with this system of transcribing observational circumstances with mathematical equations that associate spatial positions with states of affairs. Maxwell’s spatial point simultaneously has two correlate directional quantities: the magnetic and electrical vectors, whose representations in mathematics are readily cognizable but whose observation as such is impossible. The theory of Maxwell was therefore functionally meaningful without requiring a substantial ontology behind it. The definitive theory of light he discovered was not about a permanent thing situated within space and time but a set of interrelated magnitudes that could be functionally represented as a universal constant.
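By way of illustration only, and not as anything drawn from Cassirer’s text, the vacuum form of Maxwell’s equations in modern vector notation shows what such a purely functional concept looks like: the electric and magnetic vectors \(\mathbf{E}\) and \(\mathbf{B}\) attached to each spatial point are defined entirely by their lawful interrelations, and the “universal constant,” the speed of light, falls out of those relations rather than out of any observable substance:

\[
\nabla \cdot \mathbf{E} = 0, \qquad
\nabla \cdot \mathbf{B} = 0, \qquad
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad
\nabla \times \mathbf{B} = \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}, \qquad
c = \frac{1}{\sqrt{\mu_0 \varepsilon_0}}.
\]

Nothing in these equations names a thing to be observed; what they fix is a system of relations, which is the sense in which Cassirer calls the theory functionally rather than substantially meaningful.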

Hermann Ludwig von Helmholtz was among the first natural scientists to properly acknowledge the difference between observational descriptions of reality and symbolic theoretical constructions of it. As Cassirer quotes Helmholtz:

[I]n investigating [phenomena] we must proceed on the supposition that they are comprehensible. Accordingly, the law of sufficient reason is really nothing more than the urge of our intellect to bring all our perceptions under its own control. It is not a law of nature. Our intellect is the faculty of forming general conceptions. It has nothing to do with our sense-perceptions and experiences unless it is able to form general conceptions or laws. (Essay on Man 220)

The alleged sensory manifold held so dear in naively realist science gave way before Helmholtz’s demonstration that such a manifold is an ideally defined totality, ordered according to rules that distinguish properties on the basis of numerical series. That ideal unit is, for Helmholtz, the “symbol,” which cannot be considered a “copy” of a non-signifying object-in-itself (for how could that be conceived?) but the functional correspondence between two or more conceptual structures. Thus what is discovered by Helmholtzian science are the laws of interrelation among phenomena, the laws which are the very condition of our experiencing something as an object in the first place.

To Helmholtz’s experimental demonstration, Cassirer is able to add the relational but still universal nature of scientific designation; that is, the crucial differentiation between substance-concepts and function-concepts:

For laws are never mere compendia of perceptible facts, in which the individual phenomena are merely placed end to end as on a string. Rather every law, as compared to immediate perception, comprises a […] transition to a new perspective. This can occur only when we replace the concrete data provided by experience with symbolic representations, which on the basis of certain theoretical presuppositions that the observer accepts as true and valid are thought to correspond to them. (The Philosophy of Symbolic Forms III, 21)

Accordingly, the truth of science does not depend upon an accurate conceptualization of substances so much as it does on demonstrating the limits of conceptual thinking about those substances, that is, their symbolic functions.

The scientist cannot attain his end without strict obedience to the facts of nature. But this obedience is not passive submission. The work of all the great natural scientists – of Galileo and Newton, of Maxwell and Helmholtz, of Planck and Einstein—was not mere fact collecting; it was theoretical, and that means constructive, work. This spontaneity and productivity is the very center of all human activities. It is man’s highest power and it designates at the same time the natural boundary of our human world. In language, in religion, in art, in science, man can do no more than to build up his own universe – a symbolic universe that enables him to understand and interpret, to articulate and organize, to synthesize and universalize his human experience. (Essay on Man 221)

Cassirer’s essay Zur Einsteinschen Relativitätstheorie (1921) was his last major thematic enterprise before the first volume of The Philosophy of Symbolic Forms. In it he sees himself following Cohen’s task of updating Kant’s philosophical groundwork for science. Kant had taken for granted that the forms of science in his own day represented scientific thinking as such. His epistemological groundwork accordingly needed to support Newtonian physics. After Kant’s death, science leapt past the limits set by Newton just as mathematics pushed the limits of Euclidean three-dimensional geometry. Einstein’s theories of relativity effectively dismantled the authority of both; the fact that they did proved to Cassirer the non-absolute status of scientific symbolization as a doctrine about objects. An elucidation of the epistemological conditions that could allow for Einstein’s relativity was now necessary.

Cassirer replaced Kant’s static formalism with his attention to the varied and alterable features of mathematical science that could accommodate radically new forms of mathematical logic and, by extension, systems of natural science. Pure Euclidean geometry was so influential because it dealt concretely and intuitively with real things as uniform and absolute substances. And it still works for most material applications. When non-Euclidean geometry came to the fore with Gauss, Riemann, and Christoffel, it was considered a mere play of analytical concepts that held some logical curiosity but no applicability. Over time, however, a gradual shift ensued as the concept of experience widened to include non-uniform conceptions of space.

Pure Euclidean space stands, as it now seems, not closer to the demands of empirical and physical knowledge than the non-Euclidean manifolds but rather more removed. For precisely because it represents the logically simplest form of spatial construction it is not wholly adequate to the complexity of content and the material determinateness of the empirical. Its [i.e., the Euclidean] fundamental property of homogeneity, its axiom of the equivalence in principle of all points, now marks it as an abstract space; for, in the concrete and empirical manifold, there never is such uniformity, but rather thorough-going differentiation reigns in it. (“Euclidean and non-Euclidean Geometry,” in Luft [2015], 243)

It is thus not the case, as traditionally thought, that the new physical sciences simply adopted a more abstract vision of mathematics as their basis. Their physics represents a more widely-encompassing symbolic representation that expresses a new mode of experience, one less concerned with the sense impressions of real objects than with the reality of their logical relations.

Einstein needed a geometry of curvature that varied according to the relation of mass and energy in order for general relativity to work, but this of itself does not mean Euclidean geometry was or even could be proven wrong by Minkowski space-time. In the terminology of symbolic forms, Cassirer thinks Einstein’s relativity has transcended the symbolic forms of natural objects in favor of those of pure mathematical relations. The result is a fracturing into non-commensurable ways of analyzing one and the same “substance”: physically, chemically, mathematically, and so forth. Those forms ought not to be reduced to a single “meta” method that levels their differences as merely partial views. Each ought to be retained as an equally valid part of the total determination of the object. Thus Einstein was right to abandon absolute Newtonian space-time for relative Minkowski space-time. But his reason for doing so did not concern the former’s falsity. In place of a single absolutist description, the new relativism embraced an epistemology that featured a wider variety of equally valid modes of thinking about one and the same object. Objects, in Cassirer’s idiom, are relative to the symbolic form under which they are expressed.
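Purely as a modern illustration of that first claim, and not as a formula Cassirer himself discusses, Einstein’s field equation ties the curvature of space-time on its left-hand side to the distribution of mass and energy on its right:

\[
R_{\mu\nu} - \tfrac{1}{2}\,R\,g_{\mu\nu} = \frac{8\pi G}{c^{4}}\,T_{\mu\nu},
\]

where \(g_{\mu\nu}\) is the metric, \(R_{\mu\nu}\) and \(R\) are curvature quantities derived from it, and \(T_{\mu\nu}\) is the stress-energy tensor. The geometry here is no fixed Euclidean stage but one of the theory’s interrelated magnitudes, which is the sense in which, for Cassirer, the object has become relative to its symbolic form.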

The One reality can only be disclosed and defined as the ideal limit of diversely changing theories; but the setting of this limit itself is not arbitrary; it is inescapable, since the continuity of experience is established only thereby. No particular astronomical system, the Copernican no more than the Ptolemaic…may be taken as an expression of the ‘true’ cosmic order, but only the whole of these systems as they continuously unfold in accordance with a certain context. …We do not need the objectivity of absolute things, but we do require the objective determinacy of the way of experience itself. (Philosophy of Symbolic Forms III, 476)

Cassirer’s view of the evolution of science may be compared with Thomas Kuhn’s view insofar as both reject a single consistent progress toward absolute truth. Cassirer’s symbolic forms echo in Kuhn’s paradigms as incommensurable frameworks of meaning that stand in uneasy relationships with one another. But where Kuhn sees the conditions for shifted paradigms in the quasi-sociological language of the community crises brought about by insoluble intra-paradigm problems, Cassirer sees a more epistemological metamorphosis in the evolution and expansion of human thinking. More than just a professional and social shift away from Pythagoras or Galileo to Einstein or Planck, Cassirer thinks rational agency matures to embrace more variegated, more useful, and more precise symbols. This evolution does not bring the rational agent closer to the truth of objects, but it does bring more useful and exacting means by which to think about those objects. Insofar as science, more so than myth or language, cultivates that progression through its activity, it presents, for Cassirer, the prospect of carrying human nature to the very highest cultural achievements possible: “Science is the last step in man’s mental development and it may be regarded as the highest and most characteristic attainment of human culture” (Essay on Man 207).

7. Political Philosophy

Cassirer’s political philosophy has its roots in Renaissance humanism and the classics of Modern thought: Machiavelli, Rousseau, Kant, Goethe, and Humboldt. Ever concerned with a subject’s connection to the wider sphere of cultural life, Cassirer noted that the Ancient, Medieval, and Renaissance conceptions of politics were framed within a holistic worldview. In Modern times, a holistic order still obtained, but after Machiavelli this order was based upon interpersonal relationships rather than upon the divine or the natural. These social and political relationships are, like symbolic forms, neither entirely objective nor entirely subjective. They represent the construction of ourselves in the framework of our ideal comprehensive social life.

Man’s social consciousness depends upon a double act of identification and discrimination. Man cannot find himself, he cannot become aware of his individuality, except through the medium of his social life. […] Man, like the animals, submits to the rules of society but, in addition, he has an active share in bringing about, and an active power to change, the forms of social life. (Essay on Man 223)

As it was for Kant, human dignity for Cassirer derives from the capacity of rational agents to pose and constrain themselves by normative laws of their own making. Cassirer stresses, against Marx and Heidegger respectively, that it is neither the material nor the ontological conditions into which man is born or thrown that determine political order or social value. Rather, it is the active processes by which the human person creates laws, social institutions, and norms for themself that are paramount in determining the place of the human being in society. Politics is not simply the study of the relations between social institutions, as Marx and his sociological disciples believed, but of their meaningful construction within the symbolic forms of myth-making, art, poetry, religion, and science.

Human culture taken as a whole may be described as the process of man’s progressive self-liberation. Language, art, religion, science, are various phases in this process. In all of them man discovers and proves a new power – the power to build up a world of his own, an ‘ideal’ world. Philosophy cannot give up its search for a fundamental unity in this ideal world. (Essay on Man 228)

The opponent in Cassirer’s last work, The Myth of the State, is Heidegger and the kind of twentieth-century totalitarian mythologies of “crisis” by which he and so much of Germany were then entranced. Even if he did stand mostly alone, Cassirer stood firmly against the myth of Aryan supremacy, the myth of the eternal Jew, and the myth of Socialist utopia. He did not oppose the creative acts that gave rise to these myths but the unthinking allegiance they demanded of their acolytes. In so doing, Cassirer felt Germany, and not just Germany, had abandoned its heritage of classical liberalism, its tradition of laws, and its belief in the rational progress of both science and religion for a worldview based in power and struggles for personal gain masquerading as equality. With obvious reference to Heidegger and the National Socialists, Cassirer laments:

Perhaps the most important and the most alarming feature in this development of modern political thought is the appearance of a new power: the power of mythical thought. The preponderance of mythical thought over rational thought in some of our modern systems is obvious. (Myth of the State, 3)

Cassirer’s focus in Myth of the State is, however, mostly not the contemporary state of European politics. In fact, only in the last chapter is the word Nazi mentioned. The great majority of the book is taken up instead with history, almost jarringly so given the immediate crisis and Cassirer’s personal place in it. He has far more to say about medieval theories of grace, Plato’s Republic, and Hegel than he does about the rise of Hitler or the War. Of the First World War, Cassirer’s wife Toni would later write in her biography, Mein Leben mit Ernst Cassirer, that despite some limited clerical duties on behalf of Germany, their major wartime concerns were whether there was sufficient electricity to write and whether the train tickets were first class (Toni Cassirer, 1948, 116-20): “We weren’t politicians, and didn’t even know any politicians” (Ibid., 117). And that aloofness stayed with Cassirer until the end. Charles W. Hendel, who was responsible for Cassirer’s appointment at Yale and who later became the posthumous editor of Myth of the State, illustrates how frustrating Cassirer’s silence on contemporary political matters was: “Won’t you tell us the meaning of what is happening today, instead of writing about past history, science, and culture? You have so much knowledge and wisdom—we who are working with you know that so well—but you should give others, too, the benefit of it” (Myth of the State x). In the early twenty-first century, Edward Skidelsky decried Cassirer’s reluctance to speak about contemporary politics as a symptom of a greater philosophical shortcoming:

“[Cassirer’s] is an enchanting vision. But it is also a fundamentally innocent one. Liberalism may have triumphed in the political sphere, but it was the illiberal philosophy of Heidegger that won the day at Davos and went on to leave the deepest stamp on twentieth-century culture. Who now shares Cassirer’s faith in the humanizing power of art or the liberating power of science? Who now believes that the truth will make us free?” (Skidelsky 2008, 222)

8. The Davos Conference

The historical event for which Cassirer is best known is the famous conference held in Davos, Switzerland in 1929. Planned as a symposium to bring together French- and German-speaking academics in a spirit of international collaboration, the conference was set in the resort town made famous by Thomas Mann’s epic The Magic Mountain (1924). Counting nearly 1,300 attendees, more than 900 of whom were the town’s residents, the conference featured 56 lectures delivered over the span of three weeks. Among those in attendance were contemporary heavyweights like Fritz Heinemann and Karl Joël, and rising stars like Emmanuel Lévinas, Joachim Ritter, Maurice de Gandillac, Ludwig Binswanger, and a young Rudolf Carnap. The centerpiece of the conference was to have been the showdown between the two most important philosophers in Germany: Cassirer and Heidegger. Curiously, there never was a disputation proper, in the sense of an official point-by-point debate, in part because neither man was up for it: Cassirer was bed-ridden by illness and Heidegger was less interested in attending lectures than in the resort town’s recreational activities. As a characteristic expression of his disdain toward stuffy academic conferences, Heidegger even gave one of his own talks while wearing his ski-suit.

Cassirer was the student and heir of Hermann Cohen, the unchallenged leader of Marburg Neo-Kantianism. Heidegger was the most brilliant student of the Southwest Neo-Kantian Heinrich Rickert, but was recommended for the chair at Marburg by none other than the Marburger Paul Natorp. On at least three separate occasions, Cassirer and Heidegger were considered for the same academic post, as successor to Husserl, then to Rickert, and finally for the leading position in Berlin in 1930 (Gordon, 2010, 40). Cassirer and Heidegger were thus the two greatest living thinkers in the tradition of Kantian philosophy, and were invited to Davos to defend their rival interpretations on the question of whether an ontology could be derived from Kant’s epistemology. Their positions were contradictory in clear ways: Cassirer held the Marburg line that Kant’s entire project required that the thing-in-itself be jettisoned for a transcendental analysis of the forms of knowing. Heidegger wanted to recast not only Kant but philosophy itself as a fundamental investigation into the meaning of Being, and by specific extension, the human way of Being: Dasein. The debate about the proper interpretation of Kant went nearly nowhere, and Heidegger’s interpretation had more to do with Heidegger than with Kant. Cassirer, the co-editor of the critical edition of Kant’s works and the author of a superb intellectual biography, was no doubt the superior exegete. Nevertheless, Heidegger was the more captivating and original philosopher.

Beyond their divergent interpretations of Kant, the debate brought to the fore two competing intellectual forces that were at genuine odds: Cassirer’s Neo-Kantian maintenance of the spontaneous mental freedom requisite for the production of symbolic forms was pitted against Heidegger’s existential-phenomenological concentration on the irrevocable “thrownness” of human beings into a world of which the common denominator was their realization of death. Cassirer thought Heidegger vastly overstated Dasein’s thrownness and understated its spontaneity, and that his subjectivism discounted the objectivity of the sciences and of moral laws. Also, if both the character of rationality and the inviolable value of the human person lie in a subject’s spontaneous use of theoretical and practical forms of reasoning, then the danger was clear: Heidegger’s Dasein had one foot in irrationality and the other in nihilism.

The historical significance of the Davos Conference thus lay, ironically, in its symbolic meaning. Primed by the cultural clash between humanism and iconoclasm represented by Thomas Mann’s characters Settembrini and Naphta, the participants in Davos expected the same battle between the stodgy old enlightenment Cassirer and the exciting, young, radical Heidegger. No doubt some in the audience fancied themselves a Hans Castorp, whose soul, and the very fate of Europe, was caught in the tug of war between Settembrini/Cassirer’s liberal rationalism and Naphta/Heidegger’s conservative mysticism. (Though, to be sure, Mann’s model for Naphta was György Lukács and not Heidegger.) In the Weimar Republic’s “Age of Crisis,” it was not so much what either man said, but what each symbolized that mattered. As Rudolf Carnap wrote in his journal, “Cassirer speaks well, but somewhat pastorally. […] Heidegger is serious and objective, as a person very attractive” (Friedman, 2000, 7). In a subsequent satirical reenactment, a young Emmanuel Lévinas mocked Cassirer by performing in buffo what he took to be the salient point of his lectures at Davos: “Humboldt, culture, Humboldt, culture” (Skidelsky, 2008, 1). Indeed what Cassirer defended was then subject to parody among the young. Cassirer was the last of the great polymaths like Goethe, the last comprehensive historian like Ranke, the last optimist like Humboldt, and the last of the Neo-Kantian academic establishment. Heidegger represented the revolution of a new German nation, one that would sweep away the old ways of philosophy as much as Hitler would sweep away Wilhelmine politics. Heidegger welcomed crisis as the condition for new growth and invention; Cassirer saw in crisis the collapse of a culture that took so long to achieve. Cassirer was the great scholar. Heidegger was the great philosopher. Cassirer clung to rational optimism and humanist culture while Heidegger championed existential fatalism. In 1929, the Zeitgeist clearly favored the latter.

The consequences of Davos, like the meaning of the conference itself, operated on two levels. On the level of the factual, Cassirer and Heidegger would maintain a somewhat detached respect for each other, with mutually critical yet professionally cordial responses in print over the years to come. Neither man came to change either his interpretation of Kant or his philosophy generally in any major way due to the conference. Symbolically, however, Davos was a disaster for Cassirer and for Neo-Kantianism. Europe was immediately swept up in increasingly violent waves of nationalism. Soon after Hitler’s appointment as Chancellor in 1933, Jews were banned from teaching in state schools. The Night of the Long Knives happened five years after Davos, and then the Night of Broken Glass four years after that. Neo-Kantian philosophers, especially the followers and friends of Hermann Cohen, were mainly Jewish. Cassirer fled in 1933, first to England and later to Sweden, in fear of the Nazis, even while Heidegger was made Rektor at Freiburg. The Wilhelmine era’s enlightened cultural humanism, and its last defender, had clearly lost.

9. References and Further Reading

What follows is a list of Cassirer’s major works. For an exhaustive bibliography, see http://www1.uni-hamburg.de/cassirer/bib/bibgr.htm. For the contents of Cassirer’s archive at Yale, see http://www1.uni-hamburg.de/cassirer/bib/yale1.htm.

a. Cassirer’s Major Works

  • (1899) Descartes: Kritik der Mathematischen und Naturwissenschaftlichen Erkenntnis (dissertation at Marburg).
  • (1902) Leibniz’ System in seinen wissenschaftlichen Grundlagen. Marburg: Elwert.
  • (1906) Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit. Erster Band. Berlin: Bruno Cassirer.
  • (1907) Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit. Zweiter Band. Berlin: Bruno Cassirer.
  • (1910) Substanzbegriff und Funktionsbegriff: Untersuchungen über die Grundfragen der Erkenntniskritik. Berlin: Bruno Cassirer.
  • (1916) Freiheit und Form: Studien zur deutschen Geistesgeschichte. Berlin: Bruno Cassirer.
  • (1921) Zur Einsteinschen Relativitätstheorie. Erkenntnistheoretische Betrachtungen. Berlin: Bruno Cassirer.
  • (1923) Philosophie der symbolischen Formen. Erster Teil: Die Sprache. Berlin: Bruno Cassirer.
  • (1925) Philosophie der symbolischen Formen. Zweiter Teil: Das mythische Denken. Berlin: Bruno Cassirer.
  • (1925) Sprache und Mythos: Ein Beitrag zum Problem der Götternamen. Leipzig: Teubner.
  • (1927) Individuum und Kosmos in der Philosophie der Renaissance. Leipzig: Teubner.
  • (1929) Die Idee der republikanischen Verfassung. Hamburg: Friedrichsen.
  • (1929) Philosophie der symbolischen Formen. Dritter Teil: Phänomenologie der Erkenntnis. Berlin: Bruno Cassirer.
  • (1932) Die Platonische Renaissance in England und die Schule von Cambridge. Leipzig: Teubner.
  • (1932) Die Philosophie der Aufklärung. Tübingen: J.C.B. Mohr.
  • (1936) Determinismus und Indeterminismus in der modernen Physik. Göteborg: Göteborgs Högskolas Årsskrift.
  • (1939) Axel Hägerström: Eine Studie zur Schwedischen Philosophie der Gegenwart. Göteborg: Högskolas Årsskrift.
  • (1939) Descartes: Lehre, Persönlichkeit, Wirkung. Stockholm: Bermann-Fischer Verlag.
  • (1942) Zur Logik der Kulturwissenschaften. Göteborg: Högskolas Årsskrift.
  • (1944) An Essay on Man. New Haven: Yale University Press.
  • (1945) Rousseau, Kant, Goethe: Two Essays. New York: Harper & Row.
  • (1946) The Myth of the State. New Haven: Yale University Press.

b. Further Reading

  • Barash, Jeffrey Andrew (2008), The Symbolic Construction of Reality: The Legacy of Ernst Cassirer. Chicago: University of Chicago Press.
  • Bayer, Thora Ilin (2001), Cassirer’s Metaphysics of Symbolic Forms: A Philosophical Commentary. New Haven: Yale University Press.
  • Braun, H.J., Holzhey, H., & Orth, E.W. (eds.) (1998), Über Ernst Cassirers Philosophie der symbolischen Formen. Frankfurt: Suhrkamp.
  • Cassirer, Toni (1948), Mein Leben mit Ernst Cassirer. Hildesheim: Gerstenberg.
  • Friedman, Michael (2000), A Parting of the Ways: Carnap, Cassirer, and Heidegger. Peru, IL: Open Court.
  • Gaubert, Joël (1996), La science politique d’Ernst Cassirer: pour une réfondation symbolique de la raison pratique contre le mythe politique contemporain. Paris: Éd. Kimé.
  • Gordon, Peter E. (2010), Continental Divide: Heidegger, Cassirer, Davos. Cambridge: Harvard University Press.
  • Hamlin, C., & Krois, J.M. (eds.) (2004), Symbolic Forms and Cultural Studies: Ernst Cassirer’s Theory of Culture. New Haven: Yale University Press.
  • Hanson, J. & Nordin, S. (2006), Ernst Cassirer: The Swedish Years. Bern: Peter Lang.
  • Heidegger, Martin (1928), “Ernst Cassirer: Philosophie der symbolischen Formen. 2. Teil: Das mythische Denken.” Deutsche Literaturzeitung, 21: 1000–1012.
  • Itzkoff, Seymour (1971), Ernst Cassirer: Scientific Knowledge and the Concept of Man. South Bend, IN: Notre Dame University Press.
  • Krois, John Michael (1987), Cassirer: Symbolic Forms and History. New Haven: Yale University Press.
  • Langer, Susanne (1942), Philosophy in a New Key: A Study in the Symbolism of Reason, Rite and Art. Cambridge, Mass.: Harvard University Press.
  • Lipton, D. R. (1978), Ernst Cassirer: The Dilemma of a Liberal Intellectual in Germany, 1914-1933. Toronto: University of Toronto Press.
  • Lofts, S.G. (2000), Cassirer: A “Repetition” of Modernity. Albany: SUNY Press.
  • Lübbe, Hermann (1975), Cassirer und die Mythen des zwanzigsten Jahrhunderts. Göttingen: Vandenhoeck & Ruprecht.
  • Luft, Sebastian (ed.) (2015), The Neo-Kantian Reader. New York: Routledge.
  • Paetzold, Heinz (1995), Ernst Cassirer — Von Marburg nach New York: eine philosophische Biographie. Darmstadt: Wissenschaftliche Buchgesellschaft.
  • Renz, Ursula (2002), Die Rationalität der Kultur: Zur Kulturphilosophie und ihrer transzendentalen Begründung bei Cohen, Natorp und Cassirer. Hamburg: Felix Meiner.
  • Rudolph, Enno (ed.) (1999), Cassirers Weg zur Philosophie der Politik. Hamburg: Felix Meiner.
  • Schilpp, Paul Arthur (ed.) (1949), The Philosophy of Ernst Cassirer. Evanston: Library of Living Philosophers.
  • Schultz, William (2000), Cassirer and Langer on Myth: An Introduction. New York: Routledge.
  • Schwemmer, Oswald (1997), Ernst Cassirer. Ein Philosoph der europäischen Moderne. Berlin: Akademie Verlag.
  • Skidelsky, Edward (2008), Ernst Cassirer: The Last Philosopher of Culture. Princeton: Princeton University Press.
  • Tomberg, Markus (1996), Der Begriff von Mythos und Wissenschaft bei Ernst Cassirer und Kurt Hübner. Münster: LIT Verlag.
  • Verene, Donald Phillip (ed.) (1979), Symbol, Myth, and Culture: Essays and Lectures of Ernst Cassirer, 1935-1945. New Haven: Yale University Press.
  • Verene, Donald Phillip (2011) The Origins of the Philosophy of Symbolic Forms: Kant, Hegel, and Cassirer. Evanston, IL: Northwestern University Press.

 

Author Information

Anthony K. Jensen
Email: Anthony.Jensen@providence.edu
Providence College
U. S. A.

Legal Hermeneutics

The question of how best to determine the meaning of a given text (legal or otherwise) has always been the chief concern of the general field of inquiry known as hermeneutics. Legal hermeneutics is rooted in philosophical hermeneutics and takes as its subject matter the nature of legal meaning. Legal hermeneutics asks the following sorts of questions: How do we come to decide what a given law means? Who makes that decision? What are the criteria for making that decision? What should be the criteria? Are the criteria that we use for deciding what a given law means good criteria? Are they necessary criteria? Are they sufficient? In whose service do our interpretive criteria operate? How were these criteria chosen and by whom? Within what sociopolitical, sociocultural, and sociohistorical contexts were these criteria generated? Are the criteria we have used in the past to ascertain the meaning of a given law the criteria we should still use today? Why or why not? What personal or political goals do the meanings of laws serve? How can we come up with better meanings of laws? On what basis can one meaning of a given law be justifiably prioritized over another? Through an interrogation into these meta-interpretive questions, legal hermeneutics serves the critical role of helping the interpreter of laws reach a higher level of self-reflexivity about the interpretive process. From a legal hermeneutical point of view, it is primarily through this heightened transparency about the process of interpretation that better meaning assessments are generated.

Some distinctive features of legal hermeneutics are (1) it is rooted in philosophical hermeneutics; (2) within the schema of mainstream philosophies of law, it is most closely conceptually related to legal interpretivism; (3) it shares an antifoundationalist sensibility with many alternative theories of law; and (4) within jurisprudence proper (legal theory), its primary substantive focus is on the debate in constitutional theory between the interpretive methods of originalism and non-originalism.

Table of Contents

  1. Roots: Philosophical Hermeneutics
  2. Legal Hermeneutics and Mainstream Philosophy of Law
  3. Legal Hermeneutics and Alternative Theories of Law
  4. Legal Hermeneutics in Jurisprudence Proper
  5. Conclusion
  6. References and Further Reading
    1. References
    2. Further Reading

1. Roots: Philosophical Hermeneutics

The term hermeneutics can be traced back at least as far as Ancient Greece. David Hoy traces the origin of the term to the Greek god Hermes, who was, among other things, the inventor of language and an interpreter between the gods and humanity. In addition, the Greek verb ἑρμηνεύω (hermēneuō, “to interpret”) is central to Aristotle’s On Interpretation (Περὶ Ἑρμηνείας), which concerns the relationship between language, logic, and meaning.

Hermeneutical approaches to meaning are thematized and utilized in many academic disciplines: archaeology, architecture, environmental studies, international relations, political theory, psychology, religion, and sociology. Philosophical hermeneutics specifically is unique in that, rather than taking a particular approach to meaning, it is concerned with the nature of meaning, understanding, or interpretation itself.

Legal hermeneutics is rooted in philosophical hermeneutics, which asks not only the question of how best to interpret a given text, but also the deeper question of what it means to interpret a text at all. In other words, philosophical hermeneutics takes as its object of inquiry the interpretive process itself and seeks interpretive practices designed to respect that process (Dostal 2002; Malpas 2014; Wachterhauser 1994). Philosophical hermeneutics, then, can be alternately described as the philosophy of interpretation, the philosophy of understanding, or the philosophy of meaning.  The central problem of philosophical hermeneutics is how to successfully ascertain anything on the order of an objective interpretation, understanding, or meaning in light of the apparent fact that all meaning is ascertained through the filter of at least one interpreter’s subjectivity (Bleicher 1980: 1). Philosophical hermeneutics seeks transparency in the interpretive process en route to better meaning determinations. On this view, better theories of interpretation (1) capture the key features of the interpretive process, (2) recognize each act of understanding as an interpretation, and (3) are able to distinguish between more and less legitimate or objective interpretations, understandings, or meanings.

Philosophical hermeneutics has its theoretical origins in the work of the 19th-century German philologist Friedrich Ast. Ast’s Basic Elements of Grammar, Hermeneutics, and Criticism (Grundlinien der Grammatik, Hermeneutik und Kritik) of 1808 contains an early articulation of the main components of what later became known as the hermeneutic circle. Ast wrote that the basic principle of all understanding was a cyclical process of coming to understand the parts through the whole and the whole through the parts. This basic principle derived, for Ast, from the “original unity of all being” (Ast 1808: Section 72) or what Ast called spirit or Geist. (Ast’s Geist is commonly understood to have been derived from Herder’s concept of Volksgeist.)

To understand a text, for Ast, was to determine its inner meaning or spirit, its own internal development, through a circularity of reason, a dialectical relation between the parts of a given work and the whole (Ast 1808: Section 76).  What Ast called the hermeneutic of the spirit involved, in turn, the development of an understanding of the spirit of the writer and her era and an attempt to identify the one idea, or Grundidee, that unified a given text and that provided clarification regarding the relationship of the whole to the parts and the parts to the whole. In this process, for Ast, it was incumbent upon the interpreter to always remain cognizant of the historical period in which the text was situated.

Friedrich August Wolf was a contemporary of Ast’s and a fellow philologist. His Lecture on the Encyclopedia of Classical Studies (Vorlesung über die Enzyklopädie der Altertumswissenschaft) of 1831 defined hermeneutics as the science of the rules by which the meaning of signs is determined. These rules pointed, for Wolf, to a knowledge of human nature. Both historical and linguistic facts have a proper role in the interpretive process, according to Wolf, and help us to understand the organic whole that is the text. For Wolf, however, the primary task of hermeneutics was not the identification of the Grundidee or focal point of the text à la Ast, but the much more practical goal of the achievement of a high level of communication or dialogue between the interpreter of the text and the author, as well as between the interpreter and those to whom the text is to be explained.

Although aspects of the hermeneutics of both Ast and Wolf have survived into contemporary philosophical hermeneutics, the hermeneutics of both are generally understood to be concerned with what later became known as regional hermeneutics, or hermeneutics applicable to specific fields of study. Friedrich D. E. Schleiermacher, by contrast, was the first to define hermeneutics as the art of understanding itself, irrespective of field of study (Palmer 1969: 84). Underlying the specific rules of interpretation of the various fields of study, for Schleiermacher, was a unity grounded in the fact that all interpretation takes place in language (Ibid.). Schleiermacher thought that a general, rather than regional, hermeneutics was possible and that such a general hermeneutics would consist of the principles for the understanding of language (Ibid.). Specifically, for Schleiermacher, proper interpretation, or understanding, was not merely a function of grasping the thoughts of the author, but of coming to grips with the extent to which the language in which the thoughts took place affected, constrained, and informed those thoughts. Schleiermacher, then, is calling our philosophical attention to the fact that when we say we understand something, we are essentially just comparing it to something we already know, most basically a given language. Here, to understand is to place something within a pre-existing context of intelligibility. Understanding, for Schleiermacher, was therefore decidedly circular, but for him this did not amount to the conclusion that understanding was impossible. Instead, circularity is how understanding is defined. Understanding necessarily and structurally entails that the text and the interpreter share the same language and the same context of intelligibility.

Wilhelm Dilthey continued Schleiermacher’s pursuit of understanding qua understanding, but he sought to do so within the specific context of what he called the human sciences, or the Geisteswissenschaften (Dilthey 1883). The methods of scientific knowledge, for Dilthey, were too “reductionist and mechanistic” to capture the fullness of human-created phenomena (Palmer 1969: 100). The human sciences, or humanities, required instead two particular processes: (1) the development of an appreciation for the role of historical consciousness in our conceptions of meaning, and (2) a recognition that human-created phenomena are generated from “life itself” rather than through theory or concepts (Ibid.). In contemporary hermeneutic theory, the first process is often referred to as the historicality, or Geschichtlichkeit, of meaning and the second as life-philosophy, or Lebensphilosophie, the phenomenological view that meaning can only be generated and understanding can only be had through lived experience (Erlebnis) and not through the examination of concepts, theories, or other purely idealistic or rational methods (Nenon 1995).

While Dilthey observed that the categorical methods of understanding useful in science were inappropriate for use in the human sciences, Martin Heidegger switched the entire hermeneutic enterprise from an epistemological focus to an ontological one. This switch is customarily referred to as the ontological turn in hermeneutics (Kisiel 1993; Tugendhat 1992). For Heidegger, in his classic, Being and Time (1962/2008), the question of the nature of understanding, or Verstehen, could only be answered by first answering the question of the nature of what it means to be.  Accordingly, Heidegger set out in Being and Time to discover the nature of being qua being. To do so Heidegger went to the things themselves, or die Sachen selbst, in keeping with the phenomenological methodology he learned from his teacher, Edmund Husserl. Heidegger called his phenomenological inquiry into the nature of being qua being fundamental ontology. He also called it hermeneutic ontology, which highlights that, for Heidegger, being and interpretation are inextricably linked almost to the point of identity.

The idea that, for Heidegger, being and interpretation were virtually the same phenomenon is arguably best captured in two of Heidegger’s key concepts: Dasein and being-in-the-world. Dasein can be roughly translated as the human way of being, but its literal translation is “being there” or “being here.” With these concepts, Heidegger is attempting to stress that the human way of being is interactive both with one’s environment and with others in the world. To be human is to be active and involved in one’s world and with other people rather than to be in a particular static state. There are no isolated human subjects separate from the world, for Heidegger, and the human way of being is not adequately characterized by the traditional philosophical distinction between subject and object, or by the distinction between subject and other subjects (or minds and other minds, as this polarity is sometimes described) that originates, for Heidegger, in Descartes’s Meditations. Instead, being, for humans, is being-in-the-world, a term meant to highlight the lack of clear barriers between human beings and the contexts, or schemas of intelligibility, in which they find themselves (Dreyfus & Wrathall 2002). According to Heidegger, what this means for the phenomenon of understanding is that it is always a function of how a given human being is in the world, that is, a function of context. The relationship between being, or context, and understanding is reciprocal. Understanding, for Heidegger, discloses to us what it means to be, and who we are affects how we understand things. In other words, understanding, for Heidegger, is not a sort of apprehension of the way things really are, as the canonical, modern, philosophical tradition might think of it, but rather it is the process of appreciating the manner in which things are there for a particular person, or group of persons, in the world. Further, the manner in which things are there for us in the world is a function primarily of shared social and cultural practices. To understand something, then, is to be able to place it within a schema of intelligibility, which is generated by the shared social and cultural practices in which one finds oneself (Dreyfus & Wrathall 2005).

In his Truth and Method (1975), Hans-Georg Gadamer picks up on Heidegger’s concept of the hermeneutic circle of understanding that is at the core of what it means to be human in the world. While Gadamer works within the Heideggerian paradigm to the extent that he fully accepts the ontological turn in hermeneutics, his own stated project in Truth and Method is to get at the question of understanding qua understanding. Specifically, Gadamer observes that the traditional paths to truth are wrong-headed and run counter to the reality that being and interpretive understanding are intertwined. In the traditional paths to truth, truth and method are at odds. The methods used in the Western tradition will not get us to truth. These methods are critical interpretation, or traditional hermeneutics, and the Enlightenment focus on reason as the path to truth. Both of these methods have what Gadamer calls a pre-judgment against pre-judgment; that is, they both fail to acknowledge the role of the interpreter in determining truth. Traditional critical interpretation is inadequate because it seeks original intent or original meaning; that is, it holds on to the fiction that the meaning of the text can be found in the original intent of the author or in the words of the text. The Enlightenment focus on reason is an equally inadequate path to truth because it retains the subject/object distinction and takes the scientific method to be the path to truth, both of which are wrong-headed assumptions.

For Gadamer, the word pre-judgment, or Vorurteil, means the same thing as Heidegger’s fore-structure of understanding. Gadamer claims that today’s negative connotation of pre-judgment develops only with the Enlightenment (Schmidt 2006: 100). The original meaning of pre-judgment, according to Gadamer, was neither positive nor negative, but simply a view we hold, either consciously or unconsciously. All understanding necessarily starts with pre-judgments. The pre-judgments of the interpreter, for Gadamer, rather than being a barrier to truth, actually facilitate its generation. The pre-judgments of the interpreter—held as a result of the interpreter’s personal facticity—not only contribute to the generation of the question being raised in the first instance, but, if taken into account on the path to truth, are capable of being critically evaluated and revised, with the result that the quality of the interpretation is improved. Additionally, pre-judgments are either legitimate or illegitimate. Legitimate pre-judgments lead to understanding. Illegitimate pre-judgments do not. One of the goals of Truth and Method is to provide a theoretically sound basis upon which to distinguish between legitimate and illegitimate pre-judgments (Schmidt 2006: 102). Understanding or meaning, for Gadamer, is a function of legitimate pre-judgments.

The model for how understanding actually operates, for Gadamer, is the conversation or dialogue. In an authentic dialogue, says Gadamer, understanding or meaning is something that occurs inside of a tradition, which is just a set of cultural assumptions and beliefs. A tradition is a worldview, or Weltanschauung, a system of intelligibility, a framework of ideas and beliefs through which a given culture experiences and interprets the world. A tradition, in this Gadamerian sense, is the theoretical grandchild of what Ast called a given text’s Grundidee, or one idea that unified it. For Gadamer, a legitimate pre-judgment is a pre-judgment that survives throughout time, eventually becoming a central part of a given culture, a part of its tradition. Understanding or meaning is an event, a happening, the substance of which is a fusion of this narrowly defined concept of tradition and the pre-judgments of the interpreter. In this sense, understanding is not willed by the participants. If it were, the dialogue would not be authentic and understanding or meaning could never be achieved. Instead, the conversation or dialogue wills the path to understanding. The thing itself reveals the truth.

In the course of the dialogue, and as a direct and organic result of the things being discussed by the particular participants of the conversation, a question arises. This question becomes the matter at hand, the topic of the conversation. As the conversation proceeds, the answer will show up as well, and it will be a function of the “fusion of horizons” between the perspectives or pre-judgments of the participants of the conversation (Gadamer 1975). This fusion is understanding/meaning. It is the answer to the question and the closest thing there is to truth. In this way, both the things themselves and the participants of the conversation together generate both the conversation topic (the question) and the answer. Together, the things and the participants of the conversation generate the truth of the matter. Moreover, all of this takes place within a tradition that gives legitimacy and weight to the meaning generated.

It is important, for Gadamer, that the path to truth is phenomenological, that is, we must go to the things themselves, and the path is also hermeneutic in that it appreciates that the pre-judgment against pre-judgment is unavoidable. Every interpreter arrives at a text with what Gadamer calls a given horizon, or conglomeration of pre-judgments, which is analogous to a Heideggerian world or fore-structure of understanding and which has been described as a given schema of intelligibility in which an interpreter finds himself or herself. A Gadamerian horizon is a shared system of social and cultural practices that provides the scope of what shows up as meaningful for an interpreter as well as for how things show up. Picking up on the hermeneutic circle, Gadamer holds that an act of understanding is always interpretive.

Another key element of Gadamerian philosophical hermeneutics is Gadamer’s insistence that interpretation, understanding, or meaning cannot take place outside of practical application. Interpretation is more than mere explication for Gadamer. It is more than mere exegesis. Beyond these things, interpretation of a given text—and it is important that everything is a text—always and necessarily takes place through the lens of present concerns and interests. The interpreter always and necessarily, in other words, comes to the table of the interpretive conversation or dialogue with a present concern that is grounded in a given epistemological or metaphysical horizon in which the interpreter dwells. In this way, for Gadamer, Aristotle got it right that understanding necessarily occurs through practical reasoning, or phronesis. For Gadamer, “[a]pplication does not mean first understanding a given universal in itself and then afterward applying it to a concrete case. It is the very understanding of the universal…itself” (Gadamer 1975). That phronesis is central to Gadamer’s hermeneutics is not disputed (Arthos 2014).

But, even more important than this, for Gadamer, the distance in time between the interpreter and the text is not a barrier to understanding but that which enables it. Temporal distance between text and interpretation is a “positive and productive condition enabling understanding” (Gadamer 1975).  When we seek to interpret a text, we are trying to figure out not the author’s original intent but “what the text has to say to us” (Schmidt 2006: 104), and this is a function of the extent to which the author’s original intent and the meaning generated by the contemporary context and the contemporary interpreter agree, that is, the extent to which the horizons of the author and the instant interpreter fuse or blend. (Gadamer specifically discusses legal hermeneutics in Truth and Method. He writes that there are two commonly understood ways of determining meaning in the law. The first is when a judge decides a case. In such a scenario, the judge must necessarily factor the present facts into the decision. The second is the case of the legal historian. In this second scenario, although it may seem as if the task is to discover the meaning of the law by only considering the history of the law, the reality is that it is impossible for the legal historian to understand the law solely in terms of its historical origin to the exclusion of considerations of the continuing effect of the law. In other words, determinations of meaning in the law, as is the case of all determinations of meaning, necessarily and at all times involves practical application.)

Post-Gadamerian philosophical hermeneutics takes many forms but can arguably be said to begin with the work of Emilio Betti. Finding what he saw as an epistemological relativism in the philosophical hermeneutics of Gadamer, Betti returns to the general hermeneutics of Schleiermacher and Dilthey and resists the tide of the ontological turn (Pinton 1972, 1973). Betti was a legal theorist who tried to bring the hermeneutic project back to one of interpretation without reference to the human way of being. Betti believed in and sought objective understanding or objective interpretation, or Auslegung, while at the same time stressing that texts reflected human intentions. Accordingly, he thought it was possible to ascertain the meaning of the text through replicating the original creative process, the train of thought, so to speak, of the text’s author. Betti believed in the autonomy of the text (Bleicher 1980: 58). Objective interpretation was possible, for him, but this objectivity was grounded both in a priori epistemological existence à la Plato’s forms and in historical and cultural coherence (Bleicher 1980: 28-29).

Jürgen Habermas, like Emilio Betti, seeks objective understanding (Habermas 1971), but, unlike Betti and in agreement with Gadamer, Habermas believes that hermeneutics is not and cannot be merely a matter of trying to find the best method of interpretation. Instead, objectivity of interpretation is grounded in something Habermas called communicative action, a sort of Gadamerian dialogue modified by the recognition that power imbalances often distort what passes for collective understanding, and that real consensus—the closest thing available to truth and/or objective understanding—can only be had where that consensus has been generated impartially and in circumstances where agreement has been unconstrained. (Habermas’s communicative action concept is also known as communicative praxis or communicative rationality.) While Gadamer’s philosophical hermeneutics grounded a kind of quasi-objectivity in the authority of tradition, however, Habermas found this approach insufficiently able to guide social liberation and progress. The task of hermeneutics is not merely to deconstruct the process of understanding and/or to somehow ground that understanding in either method à la Betti or tradition à la Gadamer, but to determine rules of ascertaining universal validity in the social sciences en route to social change. In this way, Habermas’s hermeneutics claims that hermeneutics can and does permit the kind of value judgments of which some critics say hermeneutics is incapable.

Paul Ricoeur was a contemporary philosophical hermeneutist who is known for creating what is often described as a critical hermeneutics. For Ricoeur, meaning and understanding are to be obtained through culture and narrative, as these take place in time. Influenced by Freud, Ricoeur thought all ideology required a critique to uncover repressed and hidden meanings that exist behind surface meanings that pass for truth.  In The Conflict of Interpretations (1974), Ricoeur argued that there were many and various paths to understanding and that each uniquely adds to meaning. Ricoeur’s work has been taken up recently by phenomenologists interested in questions of the nature of paradox (Geniusas 2015).

The work of Jacques Derrida (Derrida 1976, 1978) is more commonly associated with a 20th-century movement in French philosophy known as deconstruction than with philosophical hermeneutics per se. However, there are important similarities between the two movements. First, deconstruction on its own terms, like hermeneutics, is not a method. Instead, deconstruction is a critique of authoritative systems of intelligibility or meaning that exposes the hierarchies of power within those systems. In understanding itself as outside of existing theoretical schemas, in other words, Derrida’s deconstruction is within the hermeneutic tradition. Second, deconstruction is based on Heidegger’s concept of Destruktion, a central concept in his hermeneutic ontology. But, while Heidegger’s Destruktion, a project of critiquing authoritative systems of meaning that are based on structures of foundationalist metaphysics or epistemology, concludes that every act of understanding is an act of interpretation (Heidegger, 2008/1962), Derrida’s deconstruction involves showing that language, or a text, contains conceptual oppositions that involve the prioritizing of one side of a given conceptual opposition over the other, for example, speech over writing. Still, Derrida’s deconstruction is clearly in the hermeneutic tradition in that it is designed to highlight the elliptical and enigmatic nature of language and meaning. This is particularly evident in Derrida’s concept of différance, according to which every word in a given language implicates other words, which implicate other words, in a process of infinite reference and therefore what Derrida calls absence, meaning an absence of definitive meaning.

Susan-Judith Hoffman argues that Gadamerian hermeneutics furthers feminist objectives and can be understood as a form of feminist theorizing. Highlighting Gadamer’s account of the importance of difference, his notion of understanding as an inclusive dialogue, his account of pre-judgments as conditions for understanding that must always remain provisional, his account of tradition as that which is transformed by our reflection, and his account of language (Hoffman, 2003: 103), Hoffman argues that Gadamer’s philosophical hermeneutics is in line with feminist theorizing in that it “overthrows the false universalism of the natural sciences as the privileged model of human understanding” (Ibid.: 81).  In the process, Gadamer’s hermeneutics amounts to feminist theorizing in two important ways. First, it contains a sensitivity to the historical and cultural situation of knowledge and knowledge seekers, and second, it contains the critical power to challenge reductive universalizing tendencies in traditional canons of thought (Ibid.: 82).

Linda Martín Alcoff also sees value for feminist theory in Gadamer’s hermeneutics. For Alcoff, Gadamer’s “openness to alterity,” his “move from knowledge to understanding,” and his “holism in justification and immanent realism” all align themselves with feminist theorizing (Alcoff 2003: 256). That Gadamer’s philosophical hermeneutics contained these elements was insisted upon by Gadamer himself, who saw his philosophical hermeneutics as a critique of the Enlightenment view that truth could be had through abstract reasoning, divorced from historical considerations, as well as a call for the acknowledgment that the path to truth was through the particular rather than through the universal (Gadamer 1975). Miranda Fricker has recently developed hermeneutical themes into what she calls hermeneutical injustice, according to which an injustice is done when the collective hermeneutical resources available to a given individual or group are inadequate for expressing one or more important areas of their experience (Fricker 2007).

The work of Donatella di Cesare, Günter Figal, and James Risser is at the forefront of contemporary hermeneutics. For di Cesare, the ground of hermeneutics is in Heideggerian existentialism, but this does not mean that hermeneutics is a kind of nihilism. Instead, hermeneutics, or the philosophy of understanding, is aimed at consensus; it is a constructive enterprise (di Cesare 2005). For Figal, hermeneutics is most fundamentally a critique of objectivity and a call to understand things previously understood as objective elements of human life (Figal 2010). Risser has been interpreted as attempting to advance beyond Gadamer’s philosophical hermeneutics by acknowledging the radical finitude at stake in the phenomenon of tradition (George 2014).

2. Legal Hermeneutics and Mainstream Philosophy of Law

Within mainstream philosophy of law, legal hermeneutics is most closely aligned with legal interpretivism. Legal interpretivism is conceptually positioned between the two main subfields of philosophy of law: legal positivism and natural law theory. While mainstream philosophy of law has many faces, and includes, among other theories, legal realism, legal formalism, legal pragmatism, and legal process theory, legal positivism and natural law theory form the theoretical poles between which each of the mainstream theories can be understood to lie. Legal positivism is the view, in broad strokes, that there is no necessary connection between law and morality and that law owes neither its legitimacy nor its authority to moral considerations (Feinberg and Coleman 2008; Patterson 2003). The validity of law, for the legal positivist, is determined not by its moral content but by certain social facts (Hart 1958, 1961; Dickson 2001; Coleman 2001; Gardner 2001). Natural law theory is grounded in the work of two main thinkers: John Finnis and Lon Fuller. For Finnis, an unjust law has no authority (Finnis 1969; 1980; 1991), and for Fuller, an immoral law is no law at all (Fuller 1958). Natural law theory, generally speaking, then, is the view that there is a necessary connection between law and morality and that an immoral law is not a law (Raz 2009; Simmonds 2007; Murphy 2006).

Sometimes called a third, main theory of law, legal interpretivism, developed by Ronald Dworkin, is the view that the law is essentially interpretive in nature and that it gains its authority and legitimacy from legal principles. Dworkin understands these principles to be neither bare rules nor moral tenets, but a set of guidelines to interpretation that are generated from legal practice (Dworkin 2011, 1996, 1986, 1985, 1983). Some describe legal interpretivism as a hybrid between legal positivism and natural law theory for the reason that Dworkin’s principles seem both to qualify as rules and to have a kind of normative quality that is similar to moral tenets (Hiley et al. eds. 1991; Brink 2001; Burley 2004; Greenberg 2004). But, what distinguishes legal interpretivism from both legal positivism and natural law theory is its insistence that legal meaning is tempered by the legal tradition within which it operates (Greenberg 2004; Hershovitz 2006). For the legal interpretivist, in other words, the line between legal positivism and natural law theory is not clearly drawn. Instead, rules and normative guidelines together shape and form both what the law is and what it means. This approach to legal ontology and meaning is known as the interpretive turn in analytic jurisprudence (West 2000; Feldman 1991).

Arguably, however, what legal positivism, natural law theory, and legal interpretivism all have in common is epistemological and metaphysical foundationalism. While, for the legal positivist, the answer to both the question of what the law is and the question of what the law means can be found in rules (Hart 1958), for the natural law theorist, the answer to both questions can be found in morality (Fuller 1958).  Similarly, for the legal interpretivist, the answer to both questions is found in legal principles.  In other words, for the legal interpretivist, law gains its legitimacy and authority from principles emanating from legal practice. Although law is interpretive in nature, on this view, the interpretative process stops at the point at which a judgment call has to be made as to what the/a law means, preferably by someone well-versed in the relevant legal tradition. Once that judgment call is made, we have our answer. Meaning has been determined.

The legal hermeneutical approach is similar but different in the important respect that no meaning determination is ever understood to be fixed. As is the case for the legal interpretivist, for the legal hermeneutist, law is interpretive in nature, but at no point can any meaning determination rise to the level of definitive. Things like Dworkinian principles are acknowledged and considered, along with myriad other factors relevant to good interpretive practice, but the story of the meaning of the law, for the legal hermeneutist, most certainly cannot end at any point. There is no foundation to the law, for the legal hermeneutist, and there can be none. Instead, there can only be better or worse interpretations, measured comparatively and by the quality of the interpretive practices used to generate the various interpretations. More importantly, however, for the legal hermeneutist, objective interpretation simply is not and cannot be the project. Instead, the search for legal meaning is a critical project. The search for legal meaning involves critical engagement with previous interpretations and the current interpretation and includes critical analysis of the conditions for the possibility for both.

Legal hermeneutics, then, while similar to legal interpretivism in many respects, provides an alternative to the three main theories of law in that its approach to legal meaning can be understood to avoid engagement with the question of foundationalism that is characteristic of the traditional approaches.  Rather than offering a new theory of law, legal hermeneutics “provides us with the necessary protocols for determining meaning” (Douzinas, Warrington, and McVeigh 1992: 30). Legal hermeneutics provides no specific theory of law and privileges no particular methodology or ideology. Instead, legal hermeneutics calls the interpreter of legal texts first and foremost to the fact that every act of understanding a law is an act of interpretation, and at the same time, highlights that better interpretation takes conscious and proactive account of what philosophical hermeneutics, as described above, reveals as the necessary structures and components of the interpretive process. Some might describe this feature of legal hermeneutics as taking the determinacy of meaning to be context-dependent and open-ended. While this account is on track, another key feature of legal hermeneutics is that it is a descriptive rather than a normative project. Legal hermeneutics, then, is more a way of clarifying the nature of how legal interpretation actually works than a theory of how legal interpretation ought to work. In this way, legal hermeneutics can be understood to provide the tools with which to investigate, clarify, and help solve what appear from other perspectives to be insoluble legal problems, particularly problems based in conflicts of interpretation.

3. Legal Hermeneutics and Alternative Theories of Law

Legal hermeneutics shares an antifoundationalist sensibility with many alternative theories of law, including the critical legal studies movement, Marxist legal theory, deconstructionist legal theory, postmodernist legal theory, outsider jurisprudence, and the law and literature movement. For each of these theories of law, the goal of locating law’s ultimate legitimacy, authority, or meaning anywhere at all is understood as an exercise in futility. Some characterize this feature of these theories as the failure of complete determinacy as a semantic thesis, rather than as a failure of ultimate justification as an epistemological thesis. However, for others, this distinction is not meaningful and fails to adequately account for the radical rejection of the entire project of justification inherent in alternative theories.

The critical legal studies movement was an intellectual movement in the late 1970s and early 1980s that stood for the proposition that there is radical indeterminacy in the law. Conceptually based in the critical theory of the Frankfurt School, critical legal studies holds that legal doctrine is an empty shell. There is no such thing as “the law,” for the critical legal theorist, if the law is understood as an entity that exists out of context (Binder 1996/1999: 282). Instead, law is produced by power differentials that have their origins in differences in levels of property ownership. The liberal ideal of the rule of law devoid of influence from power differentials, contained in all analytic approaches to jurisprudence, is an illusion. For this reason, law is inherently self-contradictory and self-defeating and can never be a mere formality, as liberal theory and analytic jurisprudence would have us believe. This way of understanding the law is known as the indeterminacy thesis. For some, this does not necessarily mean that law is indeterminable. However, it means that determinability is context-dependent. Others do not find this distinction meaningful.

Marxist legal theory begins with the work of Evgeny Pashukanis and continues in contemporary form in the work of Alan Hunt, among others. For Pashukanis, law was inextricably linked to capitalism and hopelessly bourgeois. Outside of capitalism, things like legal rights are unnecessary, since outside of capitalism there are no conflicting interests or rights to be meted out or over which it is necessary for persons to fight. In the socialist society that Pashukanis envisions on the other side of capitalism, what would take the place of law and all talk of individual rights would be a sort of quasi-utilitarianism that values collective satisfaction over the perceived need to protect the individual interests of individual legal subjects (Pashukanis 1924). What contemporary Marxist legal theory retains from Pashukanis is the view that law is inescapably political, merely one form of politics. In this way, law is always potentially coercive and expressive of prevailing economic relations, and the content of law always manifests the interests of the dominant class (Hunt 1996, 1999: 355). So described, the content of law, for Marxist legal theorists, has no theoretical or practical basis in anything epistemologically foundational or universal.

Deconstructionist legal theories can be considered post-structuralist, like critical legal studies, but are unique in that they center on conceptual oppositions, also known as binaries. According to the deconstructionist approach, within a given conceptual opposition, one term has been traditionally privileged over the other in a particular context, or text. A text can be a written text, an argument, a historical tradition, or a social practice. Jacques Derrida, considered the forerunner of deconstruction as a philosophy of language and meaning, famously identified a conceptual opposition between writing and speech, for example, with speech being the traditionally privileged form (Derrida 1976). Privileged, in deconstruction, means truer, more valuable, more important, or more universal than the opposing term (Balkin 1996, 1999: 368). According to deconstructionist theories of law, legal distinctions are often masked conceptual oppositions that privilege one term over another. For example, individualism is privileged over altruism, and universalizability is privileged over the attention to the particular that is an inherent part of equitable distribution. These binary concepts, and the privileging of one term in each binary, lend an instability to the law, on deconstructionist terms, that is decidedly antifoundationalist. J.M. Balkin, for example, argues that the true nature of the legal subject is ignored and obliterated by conventional legal theory (Balkin 2010; 1993). Balkin argues that when an attempt is made to understand a law, we bring our subjective experiences to bear on that attempted understanding (Ibid.). For Balkin, mainstream philosophy of law’s failure to acknowledge this is its deepest and most abiding flaw.

Postmodernist legal theories are grounded in a 20th century movement in aesthetic and intellectual thought that turned away from interpretation based in universal truths, essences, and foundations. Postmodern legal theory departs from belief in the rule of law, or any generalized or universalizable Grand Theory of Jurisprudence, in favor of using “local, small-scale problem-solving strategies to raise new questions about the relation of law, politics, and culture” (Minda 1995: 3). Beyond this, it is difficult to describe postmodernist legal theory in any general way, since the entire point of postmodernist legal theory is that generalized theories are vacuous, even impossible. Instead, there are only individual theories, individual authors of theories, and individual texts/laws. It is fair to say, however, that postmodern legal theorists generally resist the sort of conceptual theorization routinely practiced by more mainstream legal academics and analytic philosophers, on the grounds that more mainstream approaches unduly emphasize abstract theory at the expense of pragmatic concerns (Ibid.). The postmodern rejection of ultimate theories can be construed as a form of antifoundationalism.

Outsider jurisprudence is an area of legal theorizing that is highly skeptical of the ability of mainstream legal theory to address the needs of members of historically marginalized groups. Although there has been a proliferation of kinds of outsider jurisprudence in the early 21st century, including LatCrit and QueerCrit (Mahmud 2014; Valdes 2003; Eskridge 1994), there are two main kinds of outsider jurisprudence: critical race theory and feminist jurisprudence (Parks 2008; Jones 2002; Delgado 2012; Levit and Verchick 2006). Critical race theorists are concerned with the particularized experiences of African Americans in American jurisprudence. They share with the postmodernists a rejection of the idea of one grand and universally applicable theory of law that applies equally to everyone: “There is a hidden category of persons to whom the laws do not equally and universally apply, for the critical race theorists, and that category of persons is African Americans” (Minda 1995: 167). Key themes in critical race theory are a call to contextualized theorizing about the law, one that acknowledges that the lives and experiences of African Americans in America have a juridical tenor very different from those of other Americans; a critique of political liberalism, which bases its apportionment of rights on the fiction that African Americans as a group have the same degree of access to rights in American society as other Americans; and a call for juridical acknowledgment of the persistence of racism in American society (Ladson-Billings 2011; Whyte 2005; Delgado 1995: xv).

Feminist jurisprudence “[goes] beyond rules and precedents to explore the deeper structures of the law” (Chamallas 2003: xix). It operates under the belief that gender is a significant factor in American life and explores the ways in which gender, and related power dynamics between men and women throughout American legal history, have affected how American law has developed (Ibid.). Feminist jurisprudence concerns itself with legal issues of particular significance to women, such as sexual harassment, domestic violence, and pay equity. It also approaches legal theory in a way that comports with many women’s lived experiences, that is, without pretending, as mainstream jurisprudence tends to do, that gender is irrelevant to the outcome of legal disputes (Ibid.). Of primary concern to feminist legal scholars is the systemic nature of women’s inequality and the pervasiveness of female subordination through law in America. The methodology of feminist jurisprudence is the excavation and examination of hidden legalized mechanisms of discrimination to uncover hierarchies in law that operate to the detriment of the ideal of equal rights for women (Ibid.). The feminist legal scholar’s identification of hidden power dynamics at work in American law can be construed as yet another antifoundationalist perspective on law.

A recent development in outsider jurisprudence is intersectionality theory, the idea that oppression takes place across multiple, intersecting systems, or axes, of oppression (Cho, Crenshaw, and McCall 2013; MacKinnon 2013; Walby 2007). Intersectionality theory is grounded in the thought of Kimberlé Crenshaw (Crenshaw 1989, 1991) and reinforces the idea from critical race theory and feminist jurisprudence that law operates differently on the bodies of the oppressed. For Crenshaw, race and gender discrimination combine on the bodies of black women in a way that neither race discrimination nor gender discrimination alone can capture or handle. Crenshaw’s point is that ignoring race when taking up gender reinforces the oppression of people of color, and anti-racist perspectives that ignore patriarchy reinforce the oppression of women (Crenshaw 1991: 1252). More specifically, taking up any form of oppression in a vacuum ignores the way that oppression actually works in the lives of the oppressed. For the law to help combat oppression, it must grapple with the complexities and nuances of lived experience.

Containing themes very similar to those of legal hermeneutics is what is known as the law and literature movement (Fish 1999; Rorty 2007, 2000, 1998, 1991, 1979; Bruns 1992; Fiss 1982). The law and literature movement, like certain forms of legal hermeneutics, is heavily influenced by the deconstructionist philosophy of Jacques Derrida (Derrida 1990, 1992). The literary legal theorist has developed an appreciation for the costs of excluding certain types of questions from the process of ascertaining meaning in the law (Levinson and Mailloux 1988: xi). Moreover, there is an active attempt on the part of the literary legal theorist to dismantle or undo the conventional illusion that the structures that support claims to authentic, legitimate, or official meaning are built on solid ground. The role of the interpreter is also highlighted in these approaches, as is the inextricability of determinations of meaning from the power dynamics in which they take place (Thorsteinsson 2015; Surette 1995).

4. Legal Hermeneutics in Jurisprudence Proper

Legal hermeneutics in jurisprudence proper, that is, in legal theory, can be traced back to the publication of Francis Lieber’s 19th century work, Legal and Political Hermeneutics (Lieber 2010/1880). There, Lieber tried to identify principles of legal interpretation that would bring consistency and objectivity to the interpretation of the U.S. Constitution, and at the same time exposed as incoherent strict intentionalist interpretive methods, defined as those in which the so-called intent of the Framers had interpretive authority (Binder and Weisberg 2000: 48). More than 125 years after Lieber’s landmark text, contemporary legal hermeneutics is still trying to strike that balance between interpretive objectivity and sensitivity to context. Contemporary legal hermeneutics retains Lieber’s goal of objectivity of interpretation and his attention to the roles of history, temporality, politics, and socio-historical context in credible assessments of meaning. The central question of legal hermeneutics in constitutional theory is: What sorts of interpretive methods should we use to arrive at an interpretation of the constitution that approaches objectivity, given that, owing to certain realities about how the interpretive process works, it is impossible for us to ascertain the intent of the Framers?

Another question at the core of legal hermeneutics, however, is this: even if we could ascertain the intent of the Framers, which all legal hermeneutists think is impossible, why would we want to do so? A constitution is a living, breathing text designed to govern real people in real life contexts, and legal hermeneutical principles based in philosophical hermeneutics dictate that the particular time and place, that is, the context, of a given application of a given law significantly influences, and should influence, the content of the interpretation. This is an example of the hermeneutic circle at work in legal interpretation. That is, from the vantage point of legal hermeneutics, what the constitution means in a particular instance is importantly influenced by the context of application in which the interpretation is taking place, and that context of application is in turn importantly influenced by what the constitution means in that same context.

The primary focus of contemporary legal hermeneutics is the debate in constitutional theory between the interpretive methods of originalism and non-originalism. Originalism is, generally, the view that the meaning of the constitution is to be found by determining the original intent of the Framers, understood to be most prudently found in the text of the constitution itself (Scalia and Garner 2012; Calabresi 2007; Monaghan 2004). By contrast, non-originalism is, generally, the view that the constitution is a living, breathing document meant more as a set of guidelines for future lawmakers than as a strict rulebook demanding literal compliance (Cross 2013; Balkin 2011; Liu 2010). For clarification purposes, it should be noted that the divide between originalism and non-originalism is akin to the divide between epistemological foundationalism and antifoundationalism.

Within the debate between originalists and non-originalists, all legal hermeneutists are necessarily non-originalists, since by the basic tenets of legal hermeneutics original intent cannot be ascertained. What separates the legal hermeneutist from the average non-originalist, however, is a high degree of respect for the text of the constitution as an interpretive starting point, together with a call to heightened self-reflexivity regarding the degree to which one’s own pre-judgments, and the pre-judgments of previous interpreters, may be affecting the interpretive process. By the same token, just as the legal interpretivist is constrained by the principles of legal practice in the interpretive process, the legal hermeneutist is similarly constrained by the spirit of the text. Finally, while the goal of the average non-originalist is a definitive interpretation of the text, however much at odds with the original intent of the Framers, the legal hermeneutist has the more modest goal of deconstructing the mosaic of considerations that went into previous interpretations, examining each tile of the mosaic one by one, more in the service of understanding the text/law within a given context than in the service of producing anything on the order of a definitive interpretation for posterity.

Another way of thinking about legal hermeneutics, however, is to see it as neither originalist nor non-originalist, but orthogonal to the originalist/non-originalist continuum. In other words, it is consistent with the themes of legal hermeneutics to reject that continuum itself as wrong-headed and unproductive. Indeed, legal hermeneutics rejects interpretive method altogether in favor of a call to an increased level of self-reflexivity on the part of the interpreter, a call that is meant to actively and consciously engage the interpreter in the interpretive process in a way that neither originalism nor non-originalism demands.

On the contemporary scene, George Taylor’s work in legal hermeneutics follows Ricoeur’s in philosophical hermeneutics. In his “Hermeneutics and Critique in Legal Practice,” Taylor argues that Ricoeur’s approach to hermeneutics gets it right when it attempts to mediate the difference between understanding and explanation (Taylor 2000: 1101 et seq.).  Understanding, on this view, is obtained through hermeneutic methods, but explanation is obtained through science.  Ricoeur, according to Taylor, sees the interpretive enterprise as containing both elements. The way Taylor sees it, Ricoeur’s emphasis on the narrative nature of meaning acknowledges the roles of both understanding and explanation in a successful interpretation (Taylor 2000: 1123). The usefulness of legal hermeneutics, for Taylor, is that it correctly identifies and brings to the forefront that there is explanation or fact in understanding or interpretation, and there is understanding or interpretation in explanation or fact, shedding a kind of glaring light on all understandings that might deny this reality. The goals of originalism, on this view, are simply impossible to reach.

Francis J. Mootz, III agrees with Taylor about the impossibility of ascertaining original meaning (Mootz 1994). Accordingly, instead of engaging in what he understands as the necessarily fruitless exercise of attempting to ascertain original meaning, Mootz argues, we should attempt to find the interpretation that “allows the text to be most fully realized in the present situation” (Mootz 1988: 605).

Georgia Warnke describes the interpretive turn in the study of justice as an abandonment of the attempt to discern universally valid principles of justice in favor of attempts to “articulate those principles of justice that are suitable for a particular culture and society” in light of that society’s culture and traditions, “the meanings of its social goods,” and its public values (Warnke 1993: 158). We would then appeal to hermeneutical standards of coherence to reject interpretations that fail to respect that culture, those traditions, or those meanings (Ibid.). Such an approach, for Warnke, “[shifts] the emphasis from a conflict between two opposing rights…to a conflict between two interpretations of…actions and practices that are consonant with [a given culture’s] traditions and self-understandings” (Ibid.: 162).

For Gregory Leyh, legal hermeneutics reveals to us the political nature of every act of constitutional interpretation. This includes originalist as well as non-originalist approaches to constitutional interpretation. For Leyh, however, legal hermeneutics also provides us with some constructive lessons for improving the quality of our necessarily political acts of interpretation. Specifically, in “Toward a Constitutional Hermeneutics,” Leyh makes the case for a legal hermeneutics based in the philosophical hermeneutics of Hans-Georg Gadamer (Gadamer 1975) in which, as the self-understanding of the interpreters of legal texts is increased, the quality of the interpretation produced by those interpreters is increased (Leyh 1988). This self-understanding would include, primarily, an explicit acknowledgment of the role that history plays in the development of both understanding and meaning (Ibid.: 370), an explicit acknowledgment of the “irreducible conditions of all human knowing” (Ibid.: 371), and attentiveness to the kinds of issues characteristically associated with the interpretation of all texts, including legal texts (Ibid.). For Leyh, a call to the constitution’s original meaning, à la a standard originalist approach, for example, entails certain assumptions about historical understanding, for example, that it is fixed and identifiable by subsequent interpreters, which legal hermeneutics exposes as impossible. What constitutional theorists need, for Leyh, is not greater insight into the intent of the Framers, for this is not obtainable, but deeper reflection on the conditions that make historical knowledge possible at all (Ibid.: 372). For Leyh, legal hermeneutics “sets for itself an ontological task, namely, that of identifying the ineluctable relationships between text and reader, past and present, that allow for understanding to occur in the first place” (Ibid.).

There are two key aspects to Leyh’s legal hermeneutics: (1) an appreciation for the role of language in understanding, which sharpens our awareness of the “historical structures constitutive of all knowledge,” and (2) a recognition of the “enabling character of our prejudgments and preconceptions as windows to the past” (Ibid.: 372). Taking these things into consideration, it is impossible, according to Leyh, for us to obtain an understanding of historical texts like the constitution without going through the language we use today and our present-day prejudgments and preconceptions, or what Hans-Georg Gadamer called our prejudices. For Leyh, all reason is historical, and there is a historicity to all inquiry (Ibid.: 375). “No text simply sits before us and announces its meaning,” Leyh writes (1988: 375).

Rather than understanding the historicity of all inquiry as an impediment between the contemporary interpreter and the text, however, Leyh suggests that this insight should aid us in recognizing that reason “finds its expression only as it is applied concretely” (Ibid.). In other words, interpretation is always practical: it always occurs in a particular set of circumstances at a particular time and place, and it applies itself to a particular set of facts. An acknowledgement of this reality on the part of the interpreter, for Leyh, adds a level of awareness vis-à-vis the interpretive process that can only aid in making sound judgments of constitutional interpretation.

David Hoy’s take on legal hermeneutics involves a focused critique of the intentionalist position in constitutional theory, according to which the so-called intent of the framers is the ultimate authority on constitutional meaning (Hoy 1992). For Hoy, while the intentionalist believes that no interpretation is needed to locate the intent of the framers, the hermeneutist understands that the concept of intended meaning presupposes a prior understanding of meaning in a different sense of the word. The concept of an ambiguous sentence highlights this prior understanding of meaning. A given sentence can have two different meanings in this prior sense, Hoy explains, whether either or both of them were intended or not (Hoy 1992: 175). The hermeneutist acknowledges, in other words, according to Hoy, a difference between sentence meaning and speaker’s meaning. However, while the intentionalist incorrectly presumes that there are only two possible bases for a theory of meaning—intention and convention (Ibid.)—the hermeneutist understands that there can be no fact of the matter vis-à-vis sentence meaning. Hoy writes, “[Hermeneutics] acknowledges semantic complexity. It does not exclude questions about intention when these are relevant to interpretation, but it believes that since textual meaning is not reducible to intended meaning, there are many other kinds of questions that can be asked about texts” (Hoy 1992: 178).

At the same time, Hoy’s hermeneutics stands for the proposition that the traditional way law is practiced operates as a constraint on judicial discretion. It provides a schema of intelligibility within which a judge must necessarily decide a case. As Hoy indicates, using discretion to decide what the law means within the tradition of the practice of law is what judges do all the time. “Only when the judges know that the law entails one decision and they nevertheless decide something else could they be said to be rewriting,” writes Hoy (1992: 183), and the hermeneutic claim is that this is almost never the case. See also Hoy’s “Interpreting the Law: Hermeneutical and Poststructuralist Perspectives,” Southern California Law Review 58 (1985): 136-76, and “Dworkin’s Constructive Optimism v. Deconstructive Legal Nihilism,” Law and Philosophy 6 (1987): 321-56. If Hoy is right, then, as Leyh points out as well, there is no act of judicial understanding that takes place without interpretation. The possibility of uninterpreted understanding is an illusion. Instead, all acts of understanding are acts of interpretation, including originalist and/or intentionalist acts of understanding.

In the early 21st century, John T. Valauri argued that the new questions for legal hermeneutics are different from those of the late 1980s and early 1990s (Valauri 2010). For Valauri, the continuing significance of hermeneutics for legal theory is to help us sort through the varieties of originalism that compete for our allegiance in the aftermath of what he sees as a kind of unanimous consent to originalism’s legitimacy. In other words, for Valauri, the hermeneutical question is no longer whether originalism is valuable, but what kind of originalism is valuable (Ibid.). The remaining questions that need to be answered to help us sort through the varieties of originalism, for Valauri, are (1) whether the various forms of originalism share a common conception of understanding and interpretation, and (2) whether hermeneutics is a descriptive or normative practice. To address these questions, says Valauri, we need to “[recover]…the fundamental hermeneutical problem” (Gadamer 1975), which means focusing on three key hermeneutical paradigms: (1) the process of application, (2) Aristotle’s practical wisdom, and (3) the “Aristotelian face” of hermeneutics over the Heideggerian one (Valauri 2010). The significance of paradigms (1) and (2) is self-explanatory and common to all forms of hermeneutics, legal and otherwise. With paradigm (3), Valauri hopes to recover legal hermeneutics from its Heidegger-based, full-scale rejection of method, which many mainstream legal theorists find so unpalatable.

Drawing themes and seeking overlap between the various contemporary legal hermeneutists, a legal hermeneutical approach to constitutional theory can be understood as a call to the interpreter of the constitution to take into conscious consideration the following factors when engaged in constitutional interpretation: (1) the identity of the interpreter, of previous interpreters, and of the original author, (2) the sociohistorical context in which the text was written and in which the interpretation is taking place, (3) the political climate at the time the text was written and in which the interpretation is taking place, (4) the extent to which the meanings of words and concepts relevant to the interpretation have changed or have not changed over time, (5) the particularity of experience of those affected by a given law, (6) the extent to which that experience is acknowledged or unacknowledged by previous interpretations, (7) the relationship between who the interpreter is, who the interpreter takes herself to be, and the kinds of interpretive choices the interpreter makes, (8) the necessary truth that original meaning is an illusion and cannot be ascertained, and (9) the extent to which one’s own pre-judgments enter into one’s attempt at ascertaining meaning. This final factor adds a level of self-reflexivity to the interpretive enterprise that is understood to significantly improve the quality of the interpretation. In other words, from the vantage point of legal hermeneutics, the more that assumptions customarily unacknowledged in mainstream legal theory are excavated and examined, the greater the degree of legitimacy of the interpretation.

5. Conclusion

Legal hermeneutics is an approach to legal texts that understands that the legal text is always historically embedded and contextually informed, so that it is impossible to understand the law simply as a product of reason and argument. Instead, meaning in the law takes shape according to practical, material, and context-dependent factors such as power, social relations, and other contingent considerations. As Gerald Bruns has put it:

Legal hermeneutics is what occurs in the give-and-take—the dialogue—between meaning and history. The historicality of the law means that its meaning is always supplemented whenever the law is understood. This understanding is always situated, always an answer to some unique question that needs deciding, and so is different from the understanding of the law in its original meaning, say, the understanding a legal historian would have in figuring the law in terms of the situation in which it was originally handed down. The historicality of the law means that its meaning is always supplemented whenever it is understood or interpreted. Supplementation always takes the form of self-understanding; that is, it is generated by the way we understand ourselves—how we see and judge ourselves—in light of the law. But, this self-understanding throws its light on the law in turn, allowing us to grasp the original meaning of the law in a new way. The present gives the past its point. (Bruns 1992)

This seems to mean, at a minimum, that every Supreme Court decision is an interpretation, which directly undermines all originalist approaches to constitutional theory.

The claim that every Supreme Court decision is an act of interpretation, however, is not a claim about the indeterminacy of meaning itself but a more modest claim about the impossibility of ascertaining original meaning. The difference between these two positions is subtle but important. While for the non-originalist, the possibility of authoritative meaning is an illusion, for the legal hermeneutist more and less authoritative meanings are possible and are a function of the interpreter’s taking conscious account of several key factors that inform and shape the interpretive process. Taking conscious account of each of these factors when attempting to interpret a given legal text lends to the interpretative process a sort of legitimacy and authority, the possibility of which most non-originalist positions deny.

Legal hermeneutics, then, can be understood as an anti-method in constitutional theory. As Gregory Leyh has identified, “[H]ermeneutics neither supplies a method for correctly reading texts nor underwrites an authoritative interpretation of any given text, legal or otherwise” (Leyh 1992: xvii). Instead, “the activity of questioning and adopting a suspicious attitude toward authority is at the heart of hermeneutical discourse. Hermeneutics involves confronting the aporias that face us, and it attempts to undermine, at least in partial ways, the calm assurances transmitted by the received views and legal orthodoxies” (Leyh 1992: Ibid.). Arguably, any approach to legal hermeneutics that rejects its distinctively critical enterprise, then, misses the point of legal hermeneutics entirely. As an approach to legal interpretation, it is necessarily, following Heidegger and Gadamer, a complete rejection of the gods of both truth and method in favor of a call to the interpreter of laws to cast an incisive and self-reflexive gaze on all that is called mainstream legal orthodoxy.

6. References and Further Reading

a. References

  • Alcoff, Linda Martín (2003) “Gadamer’s Feminist Epistemology” in L. Code (ed.) Feminist Interpretations of Hans-Georg Gadamer, University Park: Penn State University Press.
  • Aquinas, Thomas (1998) On Law, Morality, and Politics. Ed. William P. Baumgarth and Richard J. Regan. Indianapolis: Hackett Publishing Co.
  • Aristotle (350 B.C.E./ 2000) On Interpretation. Trans. E.M. Edghill. Adelaide: University of Adelaide Library.
  • Arthos, John (2014) “What is Phronesis? Seven Hermeneutic Differences in Gadamer and Ricoeur,” Philosophy Today, 58(1): 53-66.
  • Ast, F. (1808) Grundlinien der Grammatik, Hermeneutik und Kritik. Landshut, Ger.: Jos. Thomann.
  • Balkin, J.M. (2011) Living Originalism. Cambridge: Belknap Press of Harvard University Press.
  • Balkin, J.M. (2010) “Deconstruction” in A Companion to Philosophy of Law and Legal Theory (Second Edition), Dennis Patterson, ed., Malden: Wiley-Blackwell.
  • Balkin, J.M. (1999, 1996) “Deconstruction” in A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 367-374.
  • Balkin, J.M. (1993) “Understanding Legal Understanding: The Legal Subject and the Problem of Legal Coherence,” 103 Yale Law Journal 105.
  • Binder, Guyora (1996/1999) “Critical Legal Studies,” in A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 280-290.
  • Bleicher, J. (1980) Contemporary Hermeneutics: Hermeneutics as Method, Philosophy and Critique. London, Boston: Routledge & Kegan Paul.
  • Brink, David (2001) “Legal Interpretation and Morality,” in B. Leiter (ed.), Objectivity in Law and Morals. Cambridge: Cambridge University Press.
  • Bruns, Gerald L. (1992) “Law and Language: A Hermeneutics of the Legal Text,” in Legal Hermeneutics: History, Theory, and Practice, ed. Gregory Leyh, Berkeley: University of California Press, 23-40.
  • Burley, Justine (ed.) (2004) Dworkin and His Critics: With Replies by Dworkin. Oxford: Blackwell.
  • Calabresi, Steven (2007) Originalism: A Quarter-Century of Debate. Washington, D.C.: Regnery Pub.
  • (Di) Cesare, Donatella (2005) “Reinterpreting Hermeneutics,” Philosophy Today, 49(4): 325-332.
  • Chamallas, Martha (2003) Introduction to Feminist Legal Theory. 2nd Ed. New York: Aspen Publishers.
  • Cho, S., K.W. Crenshaw and L. McCall (2013) “Toward a Field of Intersectionality Studies: Theory, Applications, and Praxis,” Signs, 38(4): 785-810.
  • Coleman, Jules (2001) The Practice of Principle. Oxford: Clarendon Press.
  • Crenshaw, K.W. (1991) “Mapping the Margins: Intersectionality, Identity Politics, and Violence against Women of Color,” Stanford Law Review, 43(6): 1241-99.
  • Crenshaw, K.W. (1989) “Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Antidiscrimination Doctrine, Feminist Theory and Antiracist Politics,” University of Chicago Legal Forum, 140: 139-67.
  • Cross, Frank B. (2013) The Failed Promise of Originalism. Stanford: Stanford Law Books, an imprint of Stanford University Press.
  • Delgado, Richard (ed.) (1995) Critical Race Theory: The Cutting Edge. Philadelphia: Temple University Press.
  • Derrida, Jacques (1976) Of Grammatology. Baltimore: Johns Hopkins University Press.
  • Derrida, Jacques (1978) Writing and Difference. Trans. A. Bass. London: Routledge & Kegan Paul.
  • Derrida, Jacques (1990) “Force of Law: The ‘Mystical Foundation of Authority,’” Cardozo Law Review, 11: 919-1045.
  • Derrida, Jacques (1992) “Before the Law” in Acts of Literature. Ed. Derek Attridge. New York and London: Routledge, 181-220.
  • Dickson, Julie (2001) Evaluation and Legal Theory. Oxford: Hart Publishing.
  • Dilthey, Wilhelm (1883) Introduction to the Human Sciences. In Makkreel, Rudolf A. and Frithjof Rodi, eds. 1989. Selected Works: Volume I: Introduction to the Human Sciences. Princeton: Princeton University Press.
  • Dostal, Robert J. (ed.) (2002) The Cambridge Companion to Gadamer. Cambridge: Cambridge University Press.
  • Douzinas, Costas, Ronnie Warrington and Shaun McVeigh (1991) Postmodern Jurisprudence: The Law of Text in the Texts of Law. London, New York: Routledge.
  • Dreyfus, H.L. and M.A. Wrathall (eds.) (2005) A Companion to Heidegger. Oxford: Blackwell.
  • Dreyfus, H.L. and M.A. Wrathall (eds.) (2002) Heidegger Reexamined (4 Volumes). London: Routledge.
  • Dworkin, Ronald (1983) “My Reply to Stanley Fish (and Walter Benn Michaels): Please Don’t Talk about Objectivity Any More,” in Mitchell, W.J.T., ed., The Politics of Interpretation. Chicago: University of Chicago Press, 287-313.
  • Dworkin, Ronald (1985) A Matter of Principle. Cambridge: Harvard University Press.
  • Dworkin, Ronald (1986) Law’s Empire. Cambridge: Harvard University Press.
  • Dworkin, Ronald (1996) “Objectivity and Truth: You’d Better Believe It,” Philosophy and Public Affairs, 25:88.
  • Dworkin, Ronald (2011) Justice for Hedgehogs. Cambridge: Harvard University Press.
  • Feldman, Stephen Matthew (1991) “The New Metaphysics: The Interpretive Turn in Jurisprudence,” Iowa Law Review, Vol. 76, 1991.
  • Figal, Günter (2010) Objectivity: The Hermeneutical and Philosophy. Albany: State University of New York Press.
  • Finnis, John (1980) Natural Law and Natural Rights. Oxford: Clarendon Press.
  • Finnis, John (ed.) (1991) Natural Law, 2 Vols. New York: New York University Press.
  • Fuller, Lon (1969) The Morality of Law. New Haven: Yale University Press, rev. edn.
  • Fiss, Owen (1982) “Objectivity and Interpretation,” Stanford Law Review, 34: 739-763.
  • Fish, Stanley (1999) Doing What Comes Naturally: Change, Rhetoric, and the Practice of Theory in Literary and Legal Studies. Durham, London: Duke University Press.
  • Fricker, M. (2007) Epistemic Injustice: Power and Ethics of Knowing, Oxford: Oxford University Press.
  • Fuller, Lon (1958) “Positivism and fidelity to law—a response to Professor Hart.” Harvard Law Review, 71: 630-72.
  • Gadamer, Hans-Georg (1975) Truth and Method. London: Sheed & Ward.
  • Gardner, John (2001) “Legal Positivism: 5 ½ Myths,” 46 American Journal of Jurisprudence, 199.
  • Geniusas, Saulius (2015) “Between Phenomenology and Hermeneutics: Paul Ricoeur’s Philosophy of Imagination,” Human Studies: A Journal for Philosophy and the Social Sciences, 38: 2, 223-241.
  • Gibbons, Michael T. (2006) “Hermeneutics, Political Inquiry, and Practical Reason: An Evolving Challenge to Political Science,” The American Political Science Review, 100(4): 563-571.
  • Liu, Goodwin (2010) Keeping Faith with the Constitution. Oxford: Oxford University Press.
  • Greenberg, Mark (2004) “How Facts Make Law,” Legal Theory, 10: 157-98.
  • Habermas, Jürgen (1971) “Der Universalitätsanspruch der Hermeneutik” (The Hermeneutic Claim to Universality) in Karl-Otto Apel et al., eds., Hermeneutik und Ideologiekritik (Hermeneutics and the Critique of Ideology). Frankfurt: Suhrkamp, 120-158.
  • Hart, H.L.A. (1961) The Concept of Law. Oxford: Oxford University Press.
  • Hart, H.L.A. (1958) “Positivism and the Separation of Law and Morals.” Harvard Law Review 71: 593-629.
  • Heidegger, Martin (2008) Being and Time. Trans. John Macquarrie and Edward Robinson. New York: HarperPerennial/Modern Thought. (Translation of Sein und Zeit. Reprint. Originally published: Harper & Row, 1962).
  • Hekman, Susan (1986) Hermeneutics and the Sociology of Knowledge. Notre Dame: University of Notre Dame Press.
  • Hershovitz, Scott (ed.) (2006) Exploring Law’s Empire. Oxford: Oxford University Press.
  • Hiley, David R. et al. (eds.) (1991) The Interpretive Turn: Philosophy, Science, Culture. Ithaca: Cornell University Press.
  • Hinchman, Lewis P. (1995) “Aldo Leopold’s Hermeneutic of Nature,” The Review of Politics, 57(2): 225-249.
  • Hoffman, S.J. (2003) “Gadamer’s Philosophical Hermeneutics and Feminist Projects,” in L. Code (ed.) Feminist Interpretations of Hans-Georg Gadamer, University Park: Penn State University Press.
  • Hoy, David Couzens (1992) “Intentions and the Law: Defending Hermeneutics,” in Legal Hermeneutics: History, Theory, and Practice. Ed. Gregory Leyh. Berkeley, Los Angeles, Oxford: University of California Press.
  • Hoy, David Couzens (1987) “Dworkin’s Constructive Optimism v. Deconstructive Legal Nihilism,” Law and Philosophy 6: 321-56.
  • Hoy, David Couzens (1985) “Interpreting the Law: Hermeneutical and Poststructuralist Perspectives,” Southern California Law Review 58: 136-76.
  • Hunt, Alan (1996, 1999) “Marxist Theory of Law”. In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 355-366.
  • Johnsen, Harald and Bjornar Olsen (1992) “Hermeneutics and Archaeology: On the Philosophy of Contextual Archaeology,” American Antiquity, 57(3): 419-436.
  • Kisiel, Theodore (1993) The Genesis of Heidegger’s Being and Time. Berkeley: University of California Press.
  • Kornprobst, Markus (2009) “International Relations as Rhetorical Discipline: Toward (Re-) Newing Horizons,” International Studies Review, 11(1): 87-108.
  • Levinson, Sanford and Steven Mailloux (1988) Interpreting Law and Literature: A Hermeneutic Reader. Evanston: Northwestern University Press.
  • Leyh, Gregory (1988) “Toward a Constitutional Hermeneutics.” American Journal of Political Science. 32(2): 369-387.
  • Leyh, Gregory (ed.) (1992) Legal Hermeneutics: History, Theory, and Practice. Berkeley, Los Angeles, Oxford: University of California Press.
  • Lieber, Francis (2010/1880) Legal and Political Hermeneutics. Lawbook Exchange, Ltd.
  • Malpas, Jeff and Hans-Helmuth Gander (eds.) (2014) The Routledge Companion to Hermeneutics. London, New York: Routledge, Taylor & Francis Group.
  • Malpas, Jeff and Hans-Helmuth Gander (1992) “Analysis and Hermeneutics,” Philosophy & Rhetoric, 25(2): 93-123.
  • Minda, Gary (1995) Postmodern Legal Movements. New York, London: New York University Press.
  • Monaghan, Henry Paul (2004) “Doing Originalism,” Columbia Law Review, 104(1): 32-38.
  • Mootz, Francis J., III (1994) “The New Legal Hermeneutics,” 47 Vand. L. Rev. 116.
  • Mootz, Francis J., III (1988) “The Ontological Basis of Legal Hermeneutics: A Proposed Model of Inquiry Based on the Work of Gadamer, Habermas, and Ricoeur,” 68 B.U.L. Rev. 523.
  • Murphy, Mark C. (2006) Natural Law in Jurisprudence & Politics. Cambridge: Cambridge University Press.
  • Nenon, Tom (1995) “Hermeneutical Truth and the Structure of Human Experience: Gadamer’s Critique of Dilthey,” in The Specter of Relativism: Truth, Dialogue, and Phronesis in Philosophical Hermeneutics, ed. Schmidt, Lawrence. Evanston: Northwestern University Press. 39-55.
  • Palmer, Richard E. (1969) Hermeneutics: Interpretation Theory in Schleiermacher, Dilthey, Heidegger, and Gadamer. Evanston: Northwestern University Press.
  • Pashukanis, Evgeny (1924/2002) The General Theory of Law and Marxism. New Brunswick: Transaction Publishers.
  • Pérez-Gómez, Alberto (1999) “Hermeneutics as Discourse in Design,” Design Issues, 15(2) Design Research: 71-79.
  • Pinton, Giorgio Alberto (1972/1973) “Emilio Betti’s (1890-1969) Theory of General Interpretation.” Ph.D. Dissertation. Hartford Seminary Foundation.
  • Ricoeur, P. (1974) The Conflict of Interpretations. Evanston: Northwestern University Press.
  • Rorty, Richard (2007) Philosophy as Cultural Politics. Cambridge: Cambridge University Press.
  • Rorty, Richard (2000) Philosophy and Social Hope. London: Penguin.
  • Rorty, Richard (1998) Achieving Our Country: Leftist Thought in Twentieth Century America. Cambridge: Harvard University Press.
  • Rorty, Richard (1991) Essays on Heidegger and Others: Philosophical Papers, Volume 3. Cambridge: Cambridge University Press.
  • Rorty, Richard (1979) Philosophy and the Mirror of Nature. Princeton: Princeton University Press.
  • Scalia, Antonin (1997) A Matter of Interpretation: Federal Courts and the Law: An Essay. Princeton: Princeton University Press.
  • Scalia, Antonin and Bryan Garner (2012) Reading Law: The Interpretation of Legal Texts. St. Paul: Thomson/West.
  • Schmidt, Lawrence (2006) Understanding Hermeneutics. Stocksfield: Acumen Publishing Ltd.
  • Surette, Leon (1995) “Richard Rorty Lays Down the Law,” Philosophy and Literature, 19(2): 261-275.
  • Taylor, George H. (2000) “Hermeneutics and Critique in Legal Practice: Critical Hermeneutics: The Intertwining of Explanation and Understanding as Exemplified in Legal Analysis,” 76 Chi.-Kent L. Rev. 1101.
  • Tugendhat, Ernst (1992) “Heidegger’s Idea of Truth.” trans. Christopher Macann, in Macann, Christopher (ed.) Martin Heidegger: Critical Assessments. 4 Vols. London: Routledge.
  • Valauri, John T. (2010) “As Time Goes By: Hermeneutics and Originalism,” Nevada Law Review, August 23, 2010.
  • Wachterhauser, Brice R (ed.) (1994) Hermeneutics and Truth. Evanston: Northwestern University Press.
  • Walby, S. (2007) “Complexity Theory, Systems Theory and Multiple Intersecting Social Inequalities,” Philosophy of the Social Sciences, 37(4): 449-70.
  • Warnke, Georgia (1993) Justice and Interpretation. Cambridge: MIT Press.
  • Wolf, Friedrich August (1831) Vorlesung über die Enzyklopädie der Altertumswissenschaft. Vorlesungen über die Altertumswissenschaft series, Ed. J.D. Gürtler, Vol. I. Leipzig: Lehnhold.

b. Further Reading

  • Attridge, Derek (ed.) (1992) Acts of Literature. New York, London: Routledge.
  • Austin, John (1832) The Province of Jurisprudence Determined.
  • Balkin, J.M. (1987) “Deconstructive Practice and Legal Theory,” Yale Law Journal, 96: 743.
  • Barnett, Randy E. (1995-1996) The Relevance of the Framers’ Intent. 19 Harv. J. L. & Pub. Pol’y 403.
  • Bentham, Jeremy (1970) Of Laws in General. Ed. H.L.A. Hart. London: University of London, Athlone Press.
  • Binder, Guyora and Robert Weisberg (2000) Literary Criticisms of Law. Princeton: Princeton University Press.
  • Bix, Brian (1999) “Natural Law Theory.” In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson. Malden, Oxford: Blackwell Publishing.
  • Bobbitt, Philip (1996, 1999) “Constitutional Law and Interpretation.” In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing.
  • Bobbitt, Philip (1982) “A Typology of Constitutional Arguments.” Constitutional Fate: Theory of the Constitution. Oxford: Oxford University Press.
  • Brennan, Jr. William (1986) “The Constitution of the United States: Contemporary Ratification,” The Great Debate: Interpreting Our Written Constitution. Washington, D.C.: The Federalist Society.
  • Brest, Paul (1980) “The Misconceived Quest for the Original Understanding,” 60 B.U. L. Rev. 204.
  • Campos, Paul (1992) “Against Constitutional Theory,” Yale Journal of Law and the Humanities 4.
  • Campos, Paul (1993) “That Obscure Object of Desire: Hermeneutics and the Autonomous Legal Text,” Minnesota Law Review 77.
  • Cicero, Marcus Tullius (1988) De Re Publica; (On the Commonwealth). trans. C.W. Keyes, Cambridge: Harvard University Press.
  • Cicero, Marcus Tullius (1988) De Legibus (On the Laws). trans. C.W. Keyes, Cambridge: Harvard University Press.
  • Delgado, Richard (2012) “Centennial Reflections on the California Law Review’s Scholarship on Race: The Structure of Civil Rights Thought,” 100 Calif. L. Rev. 431.
  • Derrida, Jacques (1986) “But beyond…(Open Letter to Anne McClintock and Rob Nixon)” Trans. Peggy Kamuf. Critical Inquiry 13 (Autumn). 167-168.
  • Derrida, Jacques (1973) “Différance.” Speech and Phenomena, and Other Essays on Husserl’s Theory of Signs. Evanston: Northwestern University Press.
  • Dilthey, Wilhelm (1958) Gesammelte Schriften. Vol. II. Stuttgart: B.G. Teubner.
  • Epictetus (1926-1928) The Discourses as reported by Arrian, the Manual, and Fragments. London, W. Heinemann; New York, G.P. Putnam’s Sons.
  • Epictetus (2008) Enchiridion. Auckland: Floating Press.
  • Eskridge, Jr., William N. (1994) “Gaylegal Narratives,” 46 Stan. L. Rev. 607, 633.
  • Eskridge, Jr., William N. (1990) “Gadamer/Statutory Interpretation,” 90 Colum. L. Rev. 609.
  • Feinberg, Joel and Jules Coleman (2008) Philosophy of Law. Belmont: Thomson Wadsworth.
  • Fish, Stanley (1984) “Fish v. Fiss,” 36 Stanford Law Review.
  • George, Theodore D (2014) “Remarks on James Risser’s ‘The Life of Understanding: A Contemporary Hermeneutics,’” Philosophy Today, 58(1): 107-116.
  • Grotius, Hugo (1625) De iure belli ac pacis libri tres. Paris: Buon.
  • Heidegger, Martin (1999) Ontology: Hermeneutics of Facticity. John van Buren (trans.) Bloomington: Indiana University Press.
  • Hoy, David Couzens (1978) The Critical Circle: Literature, History, and Philosophical Hermeneutics. Berkeley, Los Angeles, Oxford: University of California Press.
  • Hutchinson, A. (ed.) (1989) Critical Legal Studies. Totowa: Rowman & Littlefield.
  • Jones, Bernie D. (2002) “Critical Race Theory: New Strategies for Civil Rights in the New Millennium?” 18 Harv. BlackLetter J. 1.
  • Ladson-Billings, Gloria (2011) “Race…to the Top, Again: Comments on the Genealogy of Critical Race Theory,” 43 Conn. L. Rev. 1439.
  • Levinson, Sanford (1980) “Law as Literature,” 60 Texas Law Review 373-403.
  • Mahmud, Tayyab (2014) “Foreword: Looking Back, Moving Forward: Latin Roots of the Modern Global and Global Orientation of Latcrit,” 12 Seattle J. Soc. Just. 699.
  • Levit, Nancy and R.M. Verchick (eds.) (2006) Feminist Legal Theory: A Primer. New York, London: New York University Press.
  • MacKinnon, C. (2013) “Intersectionality as a Method: A Note,” Signs, 38(4), Intersectionality: Theorizing Power, Empowering Theory, 1019-30.
  • Marx, Karl (1967) Capital. A Critical Analysis of Capitalist Production, Vol. 1, New York: International.
  • Marx, Karl (1859/1994) Preface to A Contribution to the Critique of Political Economy, in Simon, supra, 209.
  • Parks, Gregory Scott (2008) “Note: Toward a Critical Race Realism,” 17 Cornell J.L. & Pub. Pol’y 683.
  • Patterson, Dennis (ed.) (2003) Philosophy of Law and Legal Theory. Malden: Blackwell Publishing.
  • Patterson, Dennis (1996/1999) “Postmodernism.” In A Companion to Philosophy of Law and Legal Theory. Ed. Dennis Patterson, Malden: Blackwell Publishing, 375-384.
  • Patterson, Dennis (1996) Law and Truth. Oxford, New York: Oxford University Press.
  • Poteat, W. H. (1985) Polanyian Meditations: in Search of a Post-Critical Logic. Durham: Duke University Press.
  • Raz, Joseph (2009) Between Authority and Interpretation: On the Theory of Law & Practical Reason, Oxford: Oxford University Press.
  • Rorty, R. (1988) The Linguistic turn: recent essays in philosophical method. Midway Reprint edition. Chicago: University of Chicago Press.
  • Rorty, R. (1979) Philosophy and the mirror of nature. Princeton: Princeton University Press.
  • Schmidt, Lawrence (ed.) (1995) The Specter of Relativism: Truth, Dialogue, and Phronesis in Philosophical Hermeneutics, Evanston: Northwestern University Press.
  • Simmonds, N.E. (2007) Law as a Moral Idea, Oxford: Oxford University Press.
  • Teo, Thomas (2011) “Empirical Race Psychology and the Hermeneutics of Epistemological Violence,” Human Studies, 34(3): 237-255.
  • Thorsteinsson, Björn (2015) “From ‘Différance’ to Justice: Derrida and Heidegger’s ‘Anaximander’s Saying,’” Continental Philosophy Review, 48(2): 255-271.
  • Tushnet, Mark V. (1983) “Following the Rules Laid Down: A Critique of Interpretivism and Neutral Principles” 96 Harvard Law Review 781.
  • Valauri, John T. (1991) “Constitutional Hermeneutics” in The Interpretive Turn: Philosophy, Science, Culture, David R. Hiley, James F. Bohman, and Richard Shusterman, eds. Ithaca: Cornell University Press.
  • Valdes, Francisco (2003) “Outsider Jurisprudence, Critical Pedagogy and Social Justice Activism: Marking the Stirrings of Critical Legal Education,” Asian American Law Journal, 10(1): Article 7.
  • Vedder, Ben (2002) “Religion and Hermeneutic Philosophy,” International Journal for Philosophy and Religion, 51(1): 39-54.
  • Warner, Richard (1996, 1999) “Legal Pragmatism” In A Companion to Philosophy of Law and Legal Theory. ed. Dennis Patterson, Malden: Blackwell Publishing, 385-393.
  • West, Robin L. (2000) “Commentary: Are There Nothing but Texts in this Class? Interpreting the Interpretive Turns in Legal Thought,” 76 Chi.-Kent L. Rev. 1125.
  • Whyte, Megan K. (2005) “Going Back to Class? The Reemergence of Class in Critical Race Theory Symposium: Introduction: From Discourse to Struggle: A New Direction in Critical Race Theory,” 11 Mich. J. Race & L. 1.

 

Author Information

Tina Botts
Email: tina.botts@oberlin.edu
Oberlin College
U. S. A.

Charles Sanders Peirce: Logic

Charles Sanders Peirce (1839-1914) was an accomplished scientist, philosopher, and mathematician, who considered himself primarily a logician. His contributions to the development of modern logic at the turn of the 20th century were colossal, original, and influential. Formal, or deductive, logic was just one of the branches in which he exercised his logical and analytical talent. His work developed upon Boole’s algebra of logic and De Morgan’s logic of relations. He worked on the algebra of relatives (1870-1885), the theory of quantification (1883-1885), graphical or diagrammatic logic (1896-1911), trivalent logic (1909), and higher-order and modal logics. He also contributed significantly to the theory and methodology of induction, and discovered a third kind of reasoning, different from both deduction and induction, which he called abduction or retroduction, and which he identified with the logic of scientific discovery.

Philosophically, logic became for Peirce a broad discipline with internal divisions and external architectonic relations to other parts of scientific inquiry. Logic depends upon, or draws its principles from, mathematics, phaneroscopy (=phenomenology), and ethics, while metaphysics and psychology depend upon logic. One of the most important characteristics of Peirce’s late logical thought is that logic becomes coextensive with semeiotic (his preferred spelling), namely the theory of signs. Peirce divides logic, when conceived as semeiotic, into (i) speculative grammar, the preliminary analysis, definition, and classification of those signs that can be used by a scientific intelligence; (ii) critical logic, the study of the validity and justification of each kind of reasoning; and (iii) methodeutic or speculative rhetoric, the theory of methods. Peirce’s logical investigations cover all three of these areas.

Table of Contents

  1. Logic among the Sciences
  2. Logic as Semeiotic
    1. Speculative Grammar
    2. Logical Critics
      1. From Three Types of Inference to Three Stages of Inquiry
      2. Abductive Logic
      3. Deductive Logic
      4. Inductive Logic
    3. Methodeutic
  3. Peirce’s Logic in Historical Perspective
  4. References and Further Reading

1. Logic among the Sciences

Peirce’s idea of logic is guided by the task of locating logic on the map of the sciences. Peirce’s mature classification of the sciences (CP 1.180-202, 1903; see Brent 1987), which is a “ladder-like scheme” (MS 328, p. 20, c. 1905), takes superordinate sciences to provide principles to subordinate sciences, forming a ladder of decreasing generality.

According to Peirce’s 1903 scheme, which he as late as 1911 considered a satisfactory account (MS 675), sciences are either sciences of discovery, sciences of review, or practical sciences. Logic is a science of discovery. The sciences of discovery are divided into mathematics, philosophy and idioscopy. Mathematics studies the necessary consequences of purely hypothetical states of things. Philosophy, by contrast, is a positive science, concerning matters of fact. Idioscopy embraces more special physical and psychical sciences, and depends upon philosophy. Philosophy in turn divides into phaneroscopy, normative sciences and metaphysics. Phaneroscopy is the investigation of what Peirce calls the phaneron: whatever is present to the mind in any way. The normative sciences (aesthetic, ethics, and logic) introduce dichotomies, in that they are, in general, the investigation of what ought and what ought not to be. Metaphysics gives an account of the universe in both its physical and psychical dimensions. Since every science draws its principles from the ones above it in the classification, logic must draw its principles from mathematics, phaneroscopy, aesthetics and ethics, while metaphysics, and a fortiori psychology, draw their principles from logic (EP 2, pp. 258-262, 1903).

In sharp contrast to the logicist hypothesis, Peirce did not believe that mathematics depends upon deductive logic. On the contrary, in a sense it is deductive logic that depends upon mathematics. For Peirce, mathematics is the practice of deduction, logic its description and analysis: Peirce’s father Benjamin Peirce had defined mathematics as the science which draws necessary conclusions (B. Peirce 1870, p. 1). Hence deductive logic for Charles became the science of drawing necessary conclusions (CP 4.239, 1902). Logic cannot furnish any justification of a piece of deductive reasoning: deduction in general is in the first place mathematically, rather than logically, valid. And deductive logic is at any rate only a part of logic: “logic is the theory of all reasoning, while mathematics is the practice of a particular kind of reasoning” (MS 78, p. 4; see Haack 1993 and Houser 1993). Logic rather draws its principles from phaneroscopy, as the latter analyzes the structure of appearance but does not pronounce upon the veracity of such appearance. Logic also draws its principles from the normative sciences of ethics and esthetics (Peirce’s preferred spelling), which precede normative logic in the ladder of generality. Ethics depends on esthetics because ethics draws from esthetics the principles involved in the idea of a summum bonum, the highest good. Since ethics is the science that distinguishes good from bad conduct, it must be concerned with deliberate, self-controlled conduct, because it is only of deliberate conduct that one can say whether it is good or bad. Logic treats of a special kind of deliberate conduct, thought, and distinguishes good from bad thinking, that is, valid from invalid reasoning. Since deliberate thought is a species of deliberate conduct, logic must draw its principles from ethics (CP 5.120-50; EP 2, pp. 196-207, 1903).

Further down the ladder of generality come metaphysics and psychology. Peirce had learnt from Kant that metaphysical conceptions mirror those of formal logic. Peirce’s criticism ever since the 1860s had been that Kant’s table of categories was mistaken not because he based it upon formal logic but because the formal logic that Kant had used was itself poor and ultimately wrong (see NEM 4, p. 162, 1898). The only way to arrive at a good metaphysics is to begin with a good logical theory (EP 2, pp. 30-31, 1898). Psychology, too, depends upon logic. According to Peirce, different versions of logical psychologism characterized the logics of his time, especially in Germany. Logic for Peirce considers not what or how we in fact think but how we ought to think; logic is a normative, not a descriptive, science. The validity of an argument consists in the fact that its conclusion is true, always or for the most part, when its premises are true; it has nothing to do with reference to a mind. Logical necessity is a necessity of (non-empirical) facts, not a necessity of thinking. No appeal to psychology is therefore of any aid in logic. On the contrary, it is psychology that stands in need of a science of logic (EP 2, pp. 242-257, 1903).

2. Logic as Semeiotic

In the 1890s (MS 595, 787), Peirce divided logic into three branches: speculative grammar (also called stechiology), logical critics (or just critics) and methodeutic (also called speculative rhetoric). The division echoes the three sciences of the medieval Trivium: grammar, dialectic and rhetoric.

Perhaps the most salient character of Peirce’s logic as a whole is that in his later works (MS L 75, 1902; MS 478, 1903; MS 693, 1904; MS 640, 1909) logic is identified with semeiotic, the science and philosophy of signs and representations. Already in his early works on the theory of inference Peirce had affirmed that logic is the branch of semeiotic that treats of one particular kind of representation, namely symbols, in their reference to their objects (W1, p. 309, 1865). By the beginning of the 20th century, he had shifted from the idea of “logic-within-semeiotic” to that of “logic-as-semeiotic.” He thus needed to distinguish between logic in the narrow sense, which he now calls logical critics, and logic in the wide sense; the latter is made coextensive with semeiotic. “Logic, in its general sense, is, as I believe I have shown, only another name for semiotic (σημειωτική), the quasi-necessary, or formal, doctrine of signs” (CP 2.227, c.1897; cf. Fisch 1986, pp. 338-341).

According to Peirce’s mature views, the enlargement of logic to cover all varieties of signs provided valuable methodological guidance for the building of an objective, anti-psychological and formal logical theory: “The study of the provisional table of the Divisions of Signs will, if I do not deceive myself, help a student to many a lesson in logic” (MS S 46, 1906; cf. MS 283, c. 1905; MS 675-676, 1911; MS 12, 1912). Therefore, his logic contains, as a proper part of it, a study of its own scope and expansions. In homage to Thomas of Erfurt’s grammatica speculativa, which in Peirce’s time was misattributed to Duns Scotus, Peirce names this part of logic “speculative grammar.”

a. Speculative Grammar

In the 1890s Peirce regarded speculative grammar as an analysis of the nature of assertion (MS 409-8, 1894; CP 3.432, 1896, MS 787, c. 1897). Starting with the Syllabus of his 1903 Lowell Lectures (A Syllabus of Certain Topics of Logic, MS 478, MS 540), speculative grammar becomes a classification of signs. In the Syllabus Peirce defines a Sign or Representamen as

the first Correlate of a triadic relation, the second Correlate being termed its Object, and the possible Third Correlate being termed its Interpretant, by which triadic relation the possible Interpretant is determined to be the first correlate of the same triadic relation to the same Object, and for some possible Interpretant. (MS 540, CP 2.242; EP 2, p. 290).

A sign for Peirce is something that represents an independent object and which thereby brings another sign, called interpretant, to represent that object as the sign does. According to a long tradition in the history of logic, Peirce declares that the principal classes of signs that logic is concerned with are terms, propositions, and arguments. But by 1903 these three elements become parts of a larger taxonomic scheme.

From the Syllabus until at least 1909, Peirce continued experimenting with principles and terminologies, without, however, settling on any definitive division. This section presents the main principles of the Syllabus classification.

Signs are divisible by three trichotomies; first, according as the sign in itself is a mere quality, is an actual existent, or is a general law; secondly, according as the relation of the sign to its Object consists in the sign’s having some character in itself, or in some existential relation to that Object, or in its relation to an Interpretant; thirdly, according as its Interpretant represents it as a sign of possibility, or as a sign of fact, or a sign of reason. (CP 2.243, 1903)

The first trichotomy considers signs as (i) tones, when taken in their material qualities (such as the blueness of the ink), as (ii) tokens (such as any instance of the word “the”), and as (iii) general types (such as the word “the”). The second trichotomy is the best known, namely that of (i) icons, or signs that bear similarity or resemblance to their objects, (ii) indices, which have factual connections to their objects, and (iii) symbols, which have rational connections to their objects. The third trichotomy divides signs into terms, propositions, and arguments. Through his work on the logic of relatives (see § 2.b.iii.), Peirce had come to consider terms as (i) rhemas, unsaturated predicates with logical bonds or subject-positions, in some ways similar to Frege’s Begriff and Russell’s propositional function; propositions as (ii) signs that unify subject and predicate and thus assert, or, as the Syllabus has it, dicisigns, signs that tell; and (iii) arguments as signs that embody the ultimate perfection and end of signs, representations of facts that are signs of other facts, as the premises are a sign of the conclusion. [Peirce’s theory of the proposition is intricate and highly original and has been thoroughly investigated in Hilpinen 1982, 1992; Ferriani 1987; Chauviré 1994; Stjernfelt 2014.]

There are cross-divisions of these three trichotomies across speculative grammar. A term or rhema is a symbol which is represented by its interpretant as an icon of its object, while a proposition or dicisign is a symbol which is represented by its interpretant as an index of its object. Arguments themselves are considered as symbols that represent their conclusion in three different ways: iconically in abduction, indexically in deduction, and symbolically in induction. (In an early cross-division proposed in 1867 these last two were interchanged. See W2, p. 58). Other outcomes of the classifications consisted in further divisions of objects and interpretants into various subtypes. [For more on Peirce’s classifications, see Weiss & Burks 1945; Short 2007, chs. 7-9; and Burch 2011.]

Grammatical taxonomy shows that there are three kinds of arguments, each manifesting a different semiotic principle. But it is up to the second branch of logic, critics, to investigate the question of logical validity and justification of such arguments. The analysis of the conditions of validity of these three kinds of reasoning is a critical, not grammatical, question.

b. Logical Critics

i. From Three Types of Inference to Three Stages of Inquiry

Logical critics is the heart of Peirce’s logic. It covers what usually goes under the name of logic proper, that is, the investigation of inference and arguments. Many 19th-century logicians (for example, John S. Mill, George Boole, John Venn and William Stanley Jevons) took the range of logic to include deductive as well as inductive logic. As appears from the classification, the remarkable novelty of Peirce’s logical critics is that it embraces three essentially distinct though not entirely unrelated types of inference: deduction, induction, and abduction. Initially, Peirce had conceived deductive logic as the logic of mathematics, and inductive and abductive logic as the logic of science. Later in his life, however, he saw these as three different stages of inquiry rather than different kinds of inference employed in different areas of scientific inquiry.

Peirce had formulated a definite theory of logical leading principles as early as the late 1860s. His argument is roughly as follows. In any inference, we pass from some fact to some other fact that follows logically from it. The former is the premise (for where there is more than one, the premises may be colligated or compounded into one copulative premise), the latter is the conclusion.

P
∴C

The conclusion follows from the premise logically, that is, according to some leading principle, L. As logic supposes inferences to be analyzed and criticized, as soon as the logician asks what it is that warrants the passage from such a premise to the conclusion, she is obliged to express the leading principle L in a proposition and to lay it down as an additional premise:

P
L
∴C

This gives what Peirce calls a complete argument, in opposition to incomplete, rhetorical or enthymematic arguments. This second argument has its own leading principle, L1, which may again be expressed in a proposition and laid down as a further premise:

P
L
L1
∴C

When L1 is not a substantially different leading principle from L, then L is said to be a logical leading principle. In Peirce’s words:

This second argument has certainly itself a leading principle, although it is a far more abstract one than the leading principle of the original argument. But you might ask, why not express this new leading principle as a premise, and so obtain a third argument having a leading principle still more abstract? If, however, you try the experiment, you will find that the third argument so obtained has no more abstract a leading principle than the second argument has. Its leading principle is indeed precisely the same as that of the second argument. This leading principle has therefore attained a maximum degree of abstractness; and a leading principle of maximum abstractness may be termed a logical principle. (NEM 4, p.175, 1898)

A logical leading principle is therefore a formal or logical proposition which, when explicitly stated, adds nothing to the premises of the inference which it governs. The central question of logical critics becomes that of determining different kinds of logical leading principles.
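
A schematic illustration in the spirit of Peirce’s discussion (the example itself is ours, not his) may help. From the premise “Enoch is a man” one may infer “Enoch is mortal” according to the material leading principle “All men are mortal.” Laying that principle down as a premise yields the complete argument “All men are mortal; Enoch is a man; therefore Enoch is mortal,” whose leading principle is now the purely formal one that whatever is asserted of all men may be asserted of any one of them. Expressing this formal principle as yet another premise yields no more abstract principle; it has reached the maximum degree of abstractness and is therefore a logical leading principle in Peirce’s sense.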

Peirce’s initial strategy for proving that there are three and only three irreducible kinds of reasoning was to use syllogism. He demonstrated that the second and the third figures are reducible to the first only through the employment of the very figure that is to be reduced. The principles involved in the three syllogistic figures cannot then be reduced to a combination of other, more primitive principles, as they invariably enter as parts into the reduction proof itself. From this Peirce drew the broader conclusion that the three figures of syllogism correspond to the three kinds of inference in general: deduction corresponds to the first figure, abduction to the second, and induction to the third:
Peirce graphic

In Peirce 1878 and Peirce 1883, abduction and induction are described as inversions of a deductive syllogism. If we call the major premise of a syllogism in the first figure Rule, its minor premise Case, and its conclusion Result, then abduction may be said to be the inference of a Case from a Result and a Rule, while induction may be said to be the inference of a Rule from a Case and a Result:

Peirce graphic
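
Peirce’s own illustration in the 1878 paper uses beans drawn from a bag; paraphrasing it, the three forms run as follows:

Deduction. Rule: All the beans from this bag are white. Case: These beans are from this bag. Result: Therefore, these beans are white.
Induction. Case: These beans are from this bag. Result: These beans are white. Rule: Therefore, all the beans from this bag are white.
Abduction. Rule: All the beans from this bag are white. Result: These beans are white. Case: Therefore, these beans are from this bag.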

Later, in 1903, Peirce came to the conclusion that the three kinds of reasoning are in fact three stages in scientific research. First comes abduction, now often also called retroduction, by which a hypothesis or conjecture that explains some surprising fact is set forth. Then comes deduction, which traces the necessary consequences of the hypothesis. Lastly comes induction, which puts those consequences to the test and generalizes its conclusions.

Any inquiry is for Peirce bound to follow this pattern: abduction–deduction–induction. Each kind of inference retains its validity and modus operandi and is logically irreducible to either of the others; yet all three of them are necessary in any complete process of inquiry. Of the three methods, Peirce took deduction to be the most secure and the least fertile, while abduction is the most fertile and the least secure.

All three departments of critics epitomize the originality of Peirce’s contributions. The following sections deal with Peirce’s abductive, deductive, and inductive logics, respectively.

ii. Abductive Logic

The central question of abductive or retroductive logic is: is there a logic of scientific discovery? If yes, what are its justification and method?

Initially, Peirce described abduction as the inference of a Case from a Rule and a Result:

Hypothesis proceeds from Rule and Result to Case; it is the formula of the […] process by which a confused concatenation of predicates is brought into order under a synthetizing predicate. (Peirce 1883, p. 145)

Its general formula is this:

Result: S is M1 M2 M3 M4
Rule: P is M1 M2 M3 M4
Case: Therefore, S is P.

A certain number of surprising facts have been observed which call for explanation, and a single predicate embracing all of them is found which would explain them. When I notice that light manifests such-and-such complicated and surprising phenomena, and I know that ether waves exhibit those same phenomena, I conclude abductively that, if light were ether waves, it would be normal for it to manifest those phenomena. This offers rational ground for the hypothesis that light is ether waves.

In 1900, Peirce began viewing this description of abduction as inadequate. What he in 1883 had called hypothesis or abduction was actually induction about characters instead of things and is therefore better called qualitative induction (see § 2.b.iv): its leading principle is inductive and not abductive. Abduction is no longer constrained by the syllogistic framework. Most generally, it is the non-inductive process of forming an explanatory hypothesis. In Peirce’s words, abduction “is the only logical operation which introduces any new idea” (CP 5.172, 1903). Although abduction asserts its conclusions only conjecturally, it has a definite logical form. The following has become its standard, albeit not ultimately satisfactory, description after Peirce’s pronouncement of it in the seventh of the Harvard Lectures of 1903:

The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true. (CP 5.189, 1903)

This schema reveals why abduction is also called retroduction: it is reasoning that leads from a consequent of an admitted consequence to its antecedent.
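
As a rough computational gloss on this schema (a minimal sketch in Python; the candidate hypotheses, their consequence sets, and the function names are invented for illustration and are not Peirce’s), one might retain exactly those conjectures from which the surprising fact would follow as a matter of course:

# Toy rendering of the 1903 schema: keep those hypotheses A such that,
# if A were true, the surprising fact C would be a matter of course.
# All data below are invented for illustration.

def abduce(surprising_fact, hypotheses):
    """Return the hypotheses worth suspecting, given what each would explain."""
    return [a for a, consequences in hypotheses.items()
            if surprising_fact in consequences]

candidates = {
    "light is a stream of corpuscles": {"light casts sharp shadows"},
    "light consists of ether waves": {"light casts sharp shadows",
                                      "light shows interference fringes"},
}

print(abduce("light shows interference fringes", candidates))
# ['light consists of ether waves']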

Another description of the logical form of abduction is contained in a later, unpublished manuscript:

In the inquiry, all the possible significant circumstances of the surprising phenomenon are mustered and pondered, until a conjecture furnishes some possible Explanation of it, by which I mean a syllogism exhibiting the surprising fact as necessarily following from the circumstances of its occurrence together with the truth of the conjecture as premisses. (MS 843, p. 41, 1908)

The explaining syllogism is the inversion of the 1903 formula:

If A were true, C would be observable.
A is true.
Therefore, C is observable.

One more and hitherto unknown formulation of retroduction is found in an unpublished letter to Lady Welby:

[The] “interrogative mood” does not mean the mere idle entertainment of an idea. It means that it will be wise to go to some expense, dependent upon the advantage that would accrue from knowing that Any/Some S is M, provided that expense would render it safe to act on that assumption supposing it to be true. This is the kind of reasoning called reasoning from consequent to antecedent. For it is related to the Modus Tollens thus:

Peirce graphic

Instead of “interrogatory”, the mood of the conclusion might more accurately be called “investigand”, and be expressed as follows:

It is to be inquired whether A is not true.

The reasoning might be called “Reasoning from Surprise to Inquiry.” (Peirce to Welby, July 16, 1905, LoF, pp. 907-908)

The whole course of thought, consisting in noticing the surprising phenomenon, searching for pertinent circumstances, asking a question, forming a conjecture, remarking that the conjecture appears to explain the surprising phenomenon, and adopting the conjecture as plausible, constitutes the first, abductive stage of inquiry. Nonetheless, its crucial phase is that of forming the conjecture itself. This is often described by Peirce as an act of insight, or an instinct for guessing right, or what Galileo called il lume naturale. [That Peirce actually got the phrase from Galileo has sometimes been contested. But see the story by Victor Baker in Bellucci, Pietarinen & Stjernfelt (2014) on the “Myth of Galileo.” Baker refers to Jaime Nubiola’s finding of Peirce’s copy of Galileo’s Opere that had that phrase underlined in Peirce’s hand. That fifteen-volume edition was, at least in 2012, still to be found at the Robbins Library at the Department of Philosophy, Harvard University.]

However, to pronounce reasoning to be instinctive would amount to excluding it from the realm of logic. For logic only considers reasoning, and reasoning is a deliberate act subject to self-control. According to Peirce, abduction is an inference type based upon a logical principle. In its most abstract shape, such logical principle gives abduction its justification, and the justification of abduction is the bottom question of logical critics (EP 2, p. 443, 1908).

According to Peirce, abduction “consists in studying the facts and devising a theory to explain them” (CP 5.145, 1903). Its only justifications are that “if we are ever to understand things at all, it must be in that way,” and that “its method is the only way in which there can be any hope of attaining a rational explanation” (CP 2.777, 1902). The only justification for a hypothesis is that it might explain the facts. But in general, an inference is valid if its leading principle is an instance of a logical principle which is conducive to the acquisition of new information. Therefore, the logical leading principle of all abductions is that nature, in general, is explainable. To suppose something inexplicable is contrary to the principles of logic: such supposition only has the appearance of an explanation conducive to the acquisition of new information, but to really suppose something inexplicable is to renounce knowledge.

That nature is explicable is therefore the primary abduction underlying all possible abductions. Human powers of insight may well be justified also inductively, that is, as the history of science testifies. But abduction’s primary justification is abductive rather than inductive: if we are to acquire new knowledge at all, sooner or later we must reason abductively. [See Bellucci & Pietarinen 2014; Burks 1946; Fann 1970; Kapitan 1992; Kapitan 1997; Paavola 2004 for further details on Peirce’s theory of abduction or retroduction, and Ma & Pietarinen 2015 for a dynamic logic approach to Peirce’s interrogative abduction.]

Of these three stages of reasoning, abduction is the most fertile but the least secure. For this reason, Peirce affirms that abduction is the principal kind of reasoning for which, after logical critics has pronounced it valid, it remains to be inquired whether and how it is advantageous. Carrying out this task pertains to the third branch of logic, methodeutic, which is discussed in § 2.c below.

iii. Deductive Logic

The works of George Boole and Augustus De Morgan provided the essential backdrop for Peirce’s development of deductive logic. Benjamin Peirce’s Linear Associative Algebra (B. Peirce 1870) also influenced his son’s early development of the algebraic logic of relatives. Peirce’s dissatisfaction with how Boole represented syllogisms as algebraic equations led him to develop new algebraic approaches to logic, which he did by combining Boole’s calculus (Boole 1847, 1854) with De Morgan’s treatment of relations (De Morgan 1847, 1860).

Some of Peirce’s most important contributions to the development of modern logic are highlighted below.

1867: In the paper “An Improvement in Boole’s Calculus of Logic” published in the Proceedings of the American Academy of Arts and Sciences (Peirce 1867), Peirce subscripted all operation symbols with a comma to differentiate the logical from the arithmetical interpretation. He also made Boole’s union operator inclusive rather than exclusive (he was anticipated in this by Jevons 1864). Peirce became aware of the limitations of Boole’s algebraic logic, such as its inability to express particular propositions (“Some X is Y”), so that it fails to properly represent quantification.

1870: Peirce’s development of De Morgan’s theory of relations is fully expounded in his paper “Description of a Notation for the Logic of Relatives, resulting from an Amplification of the Conceptions of Boole’s Calculus of Logic” (Peirce 1870), which was communicated to the American Academy of Arts and Sciences in January 1870. In this paper, Peirce combines De Morgan’s theory with Boole’s calculus. The result is a logical algebra equivalent in expressive power to first-order predicate logic without identity. Peirce’s paper introduces a number of original innovations. Among them is a new process of logical differentiation, explained in Welsh (2012, pp. 166-180). Peirce also introduces the copula of inclusion, -< (the “claw”), later also termed the sign of illation, which he also wrote in a cursive form. Inclusion is for Peirce a wider and logically simpler concept than that of equality. This difference marks another important departure from Boole’s mathematical algebra towards new types of logical algebras. Beginning with C. I. Lewis’s (1918) comments on Peirce’s algebra, the literature has discussed whether in this 1870 paper what Peirce calls relative terms are to be equated with relations (see Merrill 1997 for a summary and further references). It appears that at the very least they are what verbs and phrases express linguistically, such as lover of___, whatever is a lover of____, or buyer of____for____from____. Here blanks stand for nouns that are required to complete the expressions. Beginning in 1882, Peirce comes to define relative terms as classes of ordered pairs (later understood as relations). He denotes such terms by rhemas with blanks for expressions standing for subjects, such as ____is a lover of____, whatever____is a lover of____, and so forth. Of note is that the algebra of Peirce’s 1870 paper is able to express various forms of quantification, although the term “quantifier” and its modern conception were to emerge only later in his works after the early 1880s.

1880a: The long paper “On the Algebra of Logic” (Peirce 1880a), which was published in the American Journal of Mathematics, introduces a number of further developments of which the following six are listed here:

(1) The copula, expressed as a binary relation between classes of propositions, Pi -< Ci, is now understood to express the notion of the semantic consequence, namely that “every state of things in which a proposition of the class Pi is true is a state of things in which the corresponding propositions of the class Ci are true” (W4, p. 166). The binary relation here is thus a truth-functional implication. Moreover, the remarks in his Logic Notebook published in W4 (p. 216) were written in the same year and appear to be the first instance presenting variables v and f to denote the truth values true and false.

(2) A dash over a symbol is used to denote a negative of the symbol. A dash over the sign of illation in Pi -< Ci indicates the class complement. Constants ∞ and 0 are taken to mean the values of the possible and the impossible. An important modal component which Peirce would develop later on is thus emerging in this work.

(3) The totality of all that is possible is according to Peirce the “universe of discourse, and may be very limited,” that is, limited to that which “actually occurs,” rendering “everything which does not occur” impossible (W4, p. 170). The important idea of working with variable and restricted domains marks a notable difference not only from Frege, whose logic is well known to quantify over the entire “logical thought,” but also from Schröder, who, though also working on the algebra of logic, had nonetheless rendered Peirce’s 1885 algebra of logic (see below) so as to quantify, in a Fregean fashion, over what Peirce later in 1903 remarked to be “the whole universe of logical possibility” (MS 478, pp. 163-4).

(4) A new operation on relatives, which Peirce termed transaddition (º), is then introduced (W4, p. 204). Taking two relatives, such as being lover of___ (l) and being servant of___ (s), their relative product ls denotes whatever is lover of a servant of___. Their transaddition l º s denotes whatever is not a lover of everything but servants of___; that is, it denotes the relative product of the complements of the two terms l and s, or equivalently the complement of their relative sum (a minimal computational sketch of these operations is given after this list).

(5) The 1880 paper is also the first in which the idea of a relative sum, which is the complement of the transaddition and which Peirce in 1882 will denote by the dagger (†), is employed. For example, l † s reads lover of everything but servants of___. Hence this 1880 paper marks a decisive move towards a theory of quantification, which will see its emergence in his 1883 Note B in the Studies in Logic (see below) and which comes to be completed in his 1885 “Algebra of Logic” paper (see below).

(6) The 1880 paper also suggests a mathematical theory of lattices for the treatment of the algebra of logic (W4, pp. 183-188).
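
The operations listed in (4) and (5) can be made concrete over a small finite universe. The following is a minimal sketch in Python (the encoding, the universe, and the sample relations are ours and purely illustrative); it also checks the duality described above, namely that the complement of the relative sum coincides with the relative product of the complements:

# Dyadic relatives modelled as sets of ordered pairs over a small universe U.
U = {"a", "b", "c"}
lover   = {("a", "b"), ("b", "c")}          # a loves b, b loves c
servant = {("b", "c"), ("c", "a")}          # b serves c, c serves a

def complement(r):
    return {(x, y) for x in U for y in U} - r

def relative_product(l, s):
    # "lover of a servant of ___": some y with (x, y) in l and (y, z) in s
    return {(x, z) for x in U for z in U
            if any((x, y) in l and (y, z) in s for y in U)}

def relative_sum(l, s):
    # "lover of everything but servants of ___": for every y, (x, y) in l or (y, z) in s
    return {(x, z) for x in U for z in U
            if all((x, y) in l or (y, z) in s for y in U)}

def transaddition(l, s):
    # On the gloss above, the transaddition is the complement of the relative sum.
    return complement(relative_sum(l, s))

# De Morgan duality: complement of the relative sum = relative product of the complements.
assert transaddition(lover, servant) == relative_product(complement(lover), complement(servant))
print(sorted(relative_product(lover, servant)))   # pairs (x, z): x is a lover of a servant of z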

Arthur Prior (1958, 1964) showed that Peirce’s 1880 paper provides a complete basis for propositional logic.

1880b: In an unpublished manuscript (MS 378, Peirce 1880b) entitled “A Boolian Algebra with One Constant”, which still in 1926 was tagged “to be discarded” at the Department of Philosophy at Harvard University, Peirce reduces the number of logical operations to one constant. He states that “this notation … uses the minimum number of different signs … shows for the first time the possibility of writing both universal and particular propositions with but one copula” (W4, p. 221). Peirce’s sole connective was later termed the Sheffer stroke, and is also well known as the NAND operation, in Peirce’s terms the operation by which “[t]wo propositions written in a pair are considered to be both denied” (W4, p. 218). In the same manuscript, Peirce also discovers the expressive completeness of the NOR operation, today rightly recognized as the Peirce arrow.
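
A quick machine check makes the completeness claims vivid. The sketch below (Python; the helper names are ours and purely illustrative) defines negation, conjunction and disjunction from NAND alone and verifies the definitions by brute force; the same construction works starting from NOR:

from itertools import product

def nand(p, q): return not (p and q)
def nor(p, q):  return not (p or q)

# Definitions from NAND alone:
def NOT(p):    return nand(p, p)
def AND(p, q): return nand(nand(p, q), nand(p, q))
def OR(p, q):  return nand(nand(p, p), nand(q, q))

# The same trick works for NOR; negation, for instance:
def NOT2(p):   return nor(p, p)

for p, q in product([True, False], repeat=2):
    assert NOT(p) == (not p) == NOT2(p)
    assert AND(p, q) == (p and q)
    assert OR(p, q) == (p or q)
print("NAND (and likewise NOR) suffices to define the usual connectives.")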

1881: “On the Logic of Number” (Peirce 1881), published in American Journal of Mathematics and read before the National Academy of Sciences, was noted by Gerrit Mannoury (1909, pp. 51, 78) to be the first successful axiomatization of natural numbers. Shields (1981/2012) has shown Peirce’s axiom system to be equivalent to the better-known systems of Dedekind (1888) and Peano (1889). Peirce’s paper formulates, presumably for the first time, the notions of partial and total linear orders, recursive definitions for arithmetical operations, and the general definition of cardinal numbers in terms of ordinals. The paper also provides a purely cardinal definition of a finite set (Dedekind-finite) by checking whether De Morgan’s syllogism of transposed quantity is valid. [The syllogism of transposed quantity is expressed in the following mode of inference: Every Texan kills a Texan; Nobody is killed by but one person; Hence, every Texan is killed by a Texan.] Peirce then derives in this paper the latter property of finiteness from the ordinal one. Doing the converse assumes the axiom of choice.

During 1881-2, Peirce edited a book, published in 1883 and entitled Studies in Logic by Members of the Johns Hopkins University (Peirce 1883). It contained significant graduate work by his students Benjamin I. Gilman, Christine Ladd(-Franklin), Allan Marquand and Oscar Howard Mitchell. Peirce contributed to the volume a paper “A Theory of Probable Inference”, together with Note A, “A Limited Universe of Marks”, and Note B, “The Logic of Relatives.” Some developments in Mitchell’s paper as well as in Note B are worth highlighting.

Mitchell’s “On a New Algebra of Logic” was hailed by his teacher as “one of the greatest contributions that the whole history of logic can show” (MS 492; LoF, p. 225). Peirce attributed to Mitchell two major discoveries: first, the invention of the basic form of proof transformation and second, the interpretation of quantifiers in multiple dimensions, one of which is time. The former is similar to the resolution rule in logic programming, and consists of a series of insertions (by adding to premises) and erasures (by elimination of consequents). In Peirce’s words, “the passage from a premiss or premisses … to a necessary conclusion in the manner to which is alone usually called necessary reasoning, can always be reached by adding to the stated antecedents and subtracting from stated consequents, being understood that if an antecedent be itself a conditional proposition, its antecedent is of the nature of a consequent” (MS 905; LoF, pp. 731-732). The latter discovery—universes in multiple dimensions—has its correlate in the idea of interpreted domains and in the modern notion of temporal logics and many-sorted quantification. It can also be seen as a development of new languages that take the role of indices in the quantifiers to be mappings from contexts to values in universes of discourse. Having in mind Mitchell’s pioneering idea of logical dimensions, Peirce goes on to mention that the study of Mitchell’s paper was for him necessary in order to break “ground in the gamma [modal logic] part of the subject” of existential graphs (MS 467; LoF, p. 332; see below). Years later, Peirce defines the term “dimension” in the Dictionary of Philosophy and Psychology by noting that it is

an element or respect of extension of a logical universe of such a nature that the same term which is individual in one such element of extension is not so in another. Thus, we may consider different persons as individual in one respect, while they may be divisible in respect to time, and in respect to different admissible hypothetical states of things, etc. This is to be widely distinguished from different universes, as, for example, of things and of characters, where any given individual belonging to one cannot belong to another. The conception of a multidimensional logical universe is one of the fecund conceptions which exact logic owes to O. H. Mitchell. Schröder, in his then second volume, where he is far below himself in many respects, pronounces this conception ‘untenable’. But a doctrine which has, as a matter of fact, been held by Mitchell, Peirce, and others, on apparently cogent grounds, without meeting any attempt at refutation in about twenty years, may be regarded as being, for the present, at any rate, tenable enough to be held. (DPP 2, p. 27)

Mitchell develops, for the first time, the idea of and the notation for existential and universal quantifiers, and notices that it is from alternations of these quantifiers that logic derives its expressive power. Peirce testifies in the same dictionary entry that placing Σ and Π in alternating orders “was probably first introduced by O. H. Mitchell in his epoch-making paper” (DPP 2, p. 650). However, being limited to monadic predicates, Mitchell’s language was deprived of some expressive power.

Having supervised and perused Mitchell’s paper, in Note B of the Studies in Logic Peirce generalizes the groundwork Mitchell had laid on the theory of quantification. His theory of relatives adds indices as individual variables to the operators Σ and Π to denote individual objects. Relative products and relative sums are then defined as (lb)ij = Σx(l)ix(b)xj and (l † b)ij = Πx {(l)ix + (b)xj}, thus becoming species of existential and universal quantification: the lover of a benefactor is “a particular combination, because it implies the existence of something loved by its relate and a benefactor of its correlate.” The lover of everything but benefactors is “universal, because it implies the non-existence of anything except what is either loved by its relate or a benefactor of its correlate” (Peirce 1883, p. 189). Peirce had already had the relative sum at his disposal, and the idea of its expressing the non-existence of exceptions naturally led to its dual, existential quantification. Towards the end of Note B Peirce writes something is a lover of something as Σi Σj lij, everything is a lover of something as Πi Σj lij, there is something which stands to something in the relation of loving everything except benefactors of it as Σi Σk Πj (lij + bjk), and so on. Taking α to denote accuser to___of___, ε excuser to___of___, and π preferrer to___of___, Πi Σj Σk (α)ijk {(ε)jki + (π)kij} means that “having taken any individual i whatever, it is always possible so to select two, j and k, that i is an accuser to j of k, and also is either excused by j to k or is something to which j is preferred by k” (Peirce 1883, p. 201). The phrasing Peirce uses here (such as “having taken any individual”, “it is always possible so to select”) is indicative of a new semantic treatment of quantifiers and sequences of quantifiers which he goes on to pursue further in later papers, and which Hilpinen (1982), Hintikka (1996) and Pietarinen (2006) have shown to agree with game-theoretic semantics. Interestingly, Peirce’s examples are all stated in prenex normal form, which highlights the idea of sequences of dependent quantifiers. Peirce’s quantifiers bind variables ranging over interpreted domains. In this 1883 paper, he provides the basic inference rules, such as Σi Πj -< Πj Σi, for manipulating the strings of quantifiers. The language is not inductively defined, it lacks notation for functions, and it uses neither constants nor an equality sign, but in other respects it coincides with that of first-order predicate calculus.
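
The behaviour of such quantifier strings can be illustrated over a small interpreted domain. The sketch below (Python; the domain and the relation "loves" are invented for illustration) reads Σ as "there is" and Π as "for every", and shows why the rearrangement rule holds in one direction only:

U = range(3)
loves = {(0, 1), (1, 1), (2, 2)}            # l_ij: i loves j

def l(i, j): return (i, j) in loves

sigma_i_sigma_j = any(l(i, j) for i in U for j in U)        # Σi Σj l_ij: something loves something
pi_i_sigma_j    = all(any(l(i, j) for j in U) for i in U)   # Πi Σj l_ij: everything loves something
sigma_j_pi_i    = any(all(l(i, j) for i in U) for j in U)   # Σj Πi l_ij: something is loved by everything

print(sigma_i_sigma_j, pi_i_sigma_j, sigma_j_pi_i)          # True True False

# Σj Πi l_ij entails Πi Σj l_ij (the rearrangement rule, with the variables
# relabelled), but not conversely, as the contrast between the last two values shows.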

Alfred Tarski’s summary concerning Peirce’s contributions to the logical theory of relatives is illuminating:

[t]he title of creator of the theory of relations was reserved for C. S. Peirce. In several papers published between 1870 and 1882, he introduced and made precise all the fundamental concepts of the theory of relations and formulated and established its fundamental laws. Thus Peirce laid the foundation for the theory of relations as a deductive discipline; moreover he initiated the discussion of more profound problems in this domain. In particular, his investigations made it clear that a large part of the theory of relations can be presented as a calculus which is formally much like the calculus of classes developed by G. Boole and W. S. Jevons, but which greatly exceeds it in richness of expression and is therefore incomparably more interesting from the deductive point of view. (Tarski 1941, p. 73)

However, it was his 1885 theory of quantification that Peirce reckoned would settle the problems of deductive logic and logical analysis, in a way that decidedly brought him beyond the algebraic approach to the logic of relatives.

1885: Peirce’s logic of quantifiers comes to full bloom in his paper “On the Algebra of Logic: A Contribution to the Philosophy of Notation”, written in the summer of 1884 and published in the American Journal of Mathematics in the following year (Peirce 1885). This massive paper defies any condensed exposition; but in summary, it contains Peirce’s “five icons of algebra” as a system of natural deduction based on introduction and elimination rules. Peirce repeatedly stated that his having supervised and examined Mitchell’s paper was essential in order to arrive at the idea of these two basic operations. There is an abundant use of truth-functional propositions and an anticipation of the truth-table method to test tautologies. One of the examples comes close to the tableaux method, later proposed by Evert Beth and Jaakko Hintikka, which spells out a systematic search for counter-models by deriving contradictions from the negations of the formula to be proved. In order “to find whether a formula is necessarily true,” he says, “substitute f and v for the letters and see whether it can be supposed false by any such assignment of values” (Peirce 1885, p. 224; Pietarinen 2006; Anellis 2012a).
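
The procedure Peirce describes can be carried out mechanically. The following brute-force sketch (Python; the encoding of formulas as functions is ours) substitutes the two truth values for the letters and checks whether the formula can be supposed false under any assignment, using the formula now known as Peirce's law, ((p -> q) -> p) -> p, as the example:

from itertools import product

def is_tautology(formula, letters):
    """formula maps a truth-value assignment (a dict) to True or False."""
    return all(formula(dict(zip(letters, values)))
               for values in product([True, False], repeat=len(letters)))

implies = lambda a, b: (not a) or b
peirces_law = lambda v: implies(implies(implies(v['p'], v['q']), v['p']), v['p'])

print(is_tautology(peirces_law, ['p', 'q']))   # True: no assignment falsifies it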

When he moves on to the first-order (“first-intentional”) logic, Peirce seeks to devise a notation that is as iconic as possible, building on his semiotic insight that the more iconic a notation is, the better suited it would be for logical analysis. He starts by using “Σ for some, suggesting a sum, and Π for all, suggesting a product” (1885, p. 180). Once again, Peirce credits Mitchell, now for the method of separating the “quantifying part”—which he later termed the “Hopkinsian” to honour its place of discovery (MS 515, 1902)—from the pure Boolean expression: the latter refers to an individual by its use of indices (like pronouns in language) while the former states what that individual is. The quantifying operators are, however, “only similar to a sum and product,…because the individuals of the universe may be denumerable” (1885, p. 180). Peirce’s consideration illustrates lines of thought similar to those that prompted Löwenheim to formulate his famous 1915 theorem: if a first-order sentence has a model then it also has a countable model, or generally, models for sets of formulas of some cardinality imply models of some other infinite cardinality (Badesa 2004). (Associating infinite products and sums with conjunctions and disjunctions was what Wittgenstein took to be his own biggest mistake in logic.) The 1885 paper continues by introducing rules for quantifier manipulation, including “putting the Σs to the left, as far as possible” (1885, p. 182), which is a prelude to the idea of Skolem normal forms. One could say that it is the sequences of quantifiers, especially those of dependent quantifiers, that make a linear logical notation maximally iconic, and that it is the prenex and Skolem normal forms that bring out the maximal analyticity which logical icons exploit. The 1885 paper then presents many examples drawn from natural language to be analyzed logically with this new notation. The paper also extensively deals with issues having to do with the representation of mathematical notions such as one-to-one correspondence and identity in the second-intentional logic, developed in the third part of the paper, in which variables range over relations. There is an early attempt at axiomatizing set theory as well as some profound philosophical considerations on the possibility of developing a “method for the discovery of methods in mathematics,” which is to be based on these new approaches that aim at formulating a general theory of deductive logic.

Thanks to the volumes of the Chronological Edition of the Writings that appeared between 1982 and 2010, which cover Peirce’s work up to 1892, these earlier phases of Peirce’s deductive logic are now relatively well understood. But research from that point on has been hampered by the unavailability of systematic editions of Peirce’s later logical writings. Yet the mid-1890s mark only the beginning of a new and by far the most productive era in Peirce’s logical investigations, one that was to last until the last months of his life. This situation has by no means been adequately reflected in the secondary literature.

Although Peirce would continue his investigations on the algebra of logic throughout his life, the algebraic element would no longer assume a central position in his overall oeuvre:

In 1895 Schröder published the third huge volume of his logic, which consisted mainly of a vast elaboration in detail of the logical algebra of my Note B. That I never considered that algebra to be a great masterpiece is sufficiently shown by my giving my exposition of it no other title than “Note B.” The perusal of Schröder’s book convinced me that the algebra was not what was wanted, and in the Monist for January 1897 I produced a system of graphs which I now term Entitative Graphs. I shortly after abandoned that and took up Existential Graphs. (MS 467; LoF, p. 332)

Although it was Schröder’s elaboration that was to influence the works of the early model theorists such as Löwenheim and Skolem (Brady 2000), it was Peirce’s and Mitchell’s works that germinated the concept of first-order statements being true-in-a-model (Pietarinen 2006; Bellucci & Pietarinen 2015a). Moreover, Peirce’s incessant hunt for new logical notations and methods was much more ambitious and philosophical than his early algebraic investigations revealed.

What was to take the place of algebra were the ideas that emerged from diagrammatic, iconic and topological considerations on logical representation and reasoning. These considerations were at first prompted by logical analogues of the algebraic invariants in chemistry first developed by Peirce’s Johns Hopkins colleague J. J. Sylvester (1878) and investigated in Kempe (1886). Peirce was initially fascinated by the analogy in which a chemical atom is like a relative “in having a definite number of loose ends or “unsaturated bonds”, corresponding to the blanks of the relative” (CP 3.469, 1897). But the continual search for better and better notations for the overall purposes of logical analysis would also reveal the reasons why Peirce had to overcome this analogy between logic and chemistry.

Peirce’s theory of Existential Graphs (EGs), first conceived in summer 1896 and developed in subsequent years (for example Peirce 1897, 1906), was in part motivated by his need to respond to the expressive insufficiency and lack of analytic power of the systems described in his Note B, which he later termed the algebra of dyadic (dual) relatives, and in the 1885 general (universal) algebra of logic. The analytic power comes from the idea of subsuming what the algebraic operations do when composing concepts under one mode of composition. This composition of concepts is effected in the theory of EGs by the device of ligatures. A ligature is a complex line, composed of what Peirce terms lines of identity, which connects various parts and areas of the graphs, as in Figures 1-3 below. [See e.g. Zeman 1964; Roberts 1973; Shin 2002; Dipert 2006; Pietarinen 2005, 2006, 2011, 2015a.]
Fig. 1

Fig. 2

Fig. 3

The meaning of these lines is that the two or more descriptions apply to the same thing. For example, in Figure 2 there is a horizontal line attached to the predicate term “is obedient.” It means that “something exists which is obedient.” There is also another line which connects to the predicate term “is a catholic,” and that composition means that “something exists which is a catholic”, which is equivalent to the graph-instance in Figure 3. Since in Figure 1 these two lines are in fact connected by a continuous line, the graph-instance in Figure 1 means that “there exists a catholic which is obedient,” that is, “there exists an obedient catholic.” Ligatures, representing continuous connections composed of two or more lines of identities, stand for quantification, identity and predication, all in one go.

These EGs are drawn on a sheet of assertion that represents what the modeller knows or what mutually has been agreed upon to be the case by those who undertake the investigation of logic. The sheet thus represents the universe of discourse. The graph that is drawn on the sheet puts forth an assertion, true or false, that there is something in the universe to which it applies. This is the reason why Peirce terms these graphs existential. Drawing a circle around the graph, or alternatively, shading the area on which the graph-instance rests, means that nothing exists of the sort of description intended. In Figure 4, the assertion “something is a catholic” is denied by drawing an oval around it and thus severing that assertion from the sheet of assertion:


Fig. 4

The graph-instance depicted in Figure 4 thus means that “something exists that is not catholic.”

Peirce aimed at a diagrammatic syntax that would use a minimal number of logical signs but at the same time be maximally expressive and as analytic as possible. His ovals, for instance, have different notational functions: “The first office which the ovals fulfill is that of negation. […] The second office of the ovals is that of associating the conjunctions of terms. […] This is the office of parentheses in algebra” (MS 430, pp. 54-56, 1902). The ovals are thus not only the diagrammatic counterpart of negation but also serve to represent the compositionality of a graph-formula. He held (MS 430, 1902; MS 670, 1911) that a notation that does not separate the sign of truth-function from the representation of its scope is more analytic than a notation, such as that of an ordinary “symbolic” language, in which such a separation is forced by the one-dimensional character of the notation. The role of ovals as denials is in fact a function derived from more primitive considerations of inclusion and implication (Bellucci & Pietarinen 2015; MS 300, 1908).

As to expressivity, Peirce had already recognized that the notion of dependent quantification was essential in any system expressive enough to serve the purposes of the logical analysis of assertions. The nested system of ovals in EGs effectuates this in a natural way, much in contrast to algebras that resort to an explicit use of parentheses and other punctuation devices. For example, the graph in Figure 5 means that “Every Catholic adores some woman.” The graph in Figure 6 means that “Some woman is adored by every Catholic.” Peirce notes that the latter asserts more, since it states that all Catholics adore the same woman, whereas the former allows different Catholics to adore different women.

Fig. 5

Fig. 6

The graph in Figure 7 means that “anything whatever is unloved by something that benefits it,” that is, “everything is benefitted by something or other that does not love it”:


Fig. 7

Lastly, Figure 8 provides an example of a very complex graph taken from MS 504 (1898):

 

Fig. 8

Peirce provided the meaning in natural language this way:

Every being unless he worships some being who does not create all beings either does not believe any being (unless it be not a woman) to be any mother of a creator of all beings or else he praises that woman to every being unless to a person whom he does not think he can induce to become anything unless it be a non-praiser of that woman to every being.

It is on the level of semantics that the power of dependent quantification comes to the fore. Peirce carried out the semantics by defining the basics of what today is recognized as two-player zero-sum semantic games. For Peirce these games take place between the Graphist/Utterer and the Grapheus/Interpreter. [Sometimes, especially in Peirce’s model-building games, these roles split, so that the Grapheus and the Interpreter play separate roles; see Pietarinen 2013.] Peirce’s semantic games were not limited to EGs; he applied the same idea also to the interpretation of quantificational expressions and connectives in his general algebra of logic.

It speaks to the superiority of EGs over algebraic systems that in them deduction, following Mitchell’s work, is reduced to a minimum number of permissive operations. Peirce termed these operations illative rules of transformation, and in effect they consist of only two: insertions (permissions to draw a graph-instance on the sheet of assertion) and erasures (permissions to erase a graph-instance from the sheet). More precisely, the oddly enclosed areas of graphs (areas within an odd number of enclosures) permit inserting any graph on that area, while evenly enclosed areas permit erasing any graph from that area. A copy of a graph-instance is permitted to be pasted on that same area or any area deeper within the same nest of enclosures (the rule of iteration), and a copy thus iterated is permitted to be erased (the converse rule of deiteration). An interpretational corollary is that a double enclosure with no intervening graphs in the middle area can be inserted and erased at will.
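
The parity conditions can be pictured with a small computational sketch (Python; the encoding of cuts as nested lists and all names are ours, not Peirce’s notation). Atoms stand for whole propositions; a cut is a list enclosing its contents; the sheet of assertion is the outermost level, at depth 0:

# The graph encoded below is "not (C and not O)", read "if C then O":
# on the sheet there is one cut containing the atom C together with a
# second cut that in turn contains the atom O.
sheet = [["C", ["O"]]]

def areas(area, depth=0):
    """Yield (depth, contents) for every area of the graph, outermost first."""
    yield depth, area
    for item in area:
        if isinstance(item, list):          # a cut encloses a deeper area
            yield from areas(item, depth + 1)

def permission(depth):
    """Evenly enclosed areas permit erasure; oddly enclosed areas permit insertion."""
    return "erasure permitted" if depth % 2 == 0 else "insertion permitted"

for d, contents in areas(sheet):
    print(d, permission(d), contents)

# Iteration allows copying a graph-instance into the same area or any area
# nested more deeply within the same nest of cuts; deiteration erases such a
# copy; and a double cut with nothing between its ovals may be added or
# removed anywhere.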

A more detailed exposition of these illative rules of transformation would need to show their application to quantificational expressions, namely the application of insertions and erasures to ligatures. A flavor of such proofs is given by inspecting Figures 1, 2 and 3: an application of a permissible erasure to the line of identity in Figure 1 yields the graph-instance in Figure 2, and another application of a permissible erasure to the upper part of the graph-instance in Figure 2 yields the graph-instance depicted in Figure 3. Thus what is represented in Figure 2 is a logical consequence of the graph-instance in Figure 1, and what is represented in Figure 3 is a logical consequence of the graph-instance given in Figure 2.

Roberts (1973) was the first to prove that these transformation rules, first given by Peirce in 1898, form a semantically complete system of deduction. Roberts did not mention, however, that Peirce had demonstrated their soundness in 1898 and again in 1903 and that he had argued for their completeness in terms of what he termed the “perfect archegetic rules of transformation” in the unpublished parts of the Syllabus for the Lowell Lectures that Peirce delivered in 1903.

The polarity of the outermost ends or portions of ligatures determines whether the quantification is existential (that end or portion resting on even/positive areas) or universal (if it rests on odd/negative area). Unlike in the Tarski-type semantics, but just as what happens in game-theoretic semantics, the preferred rule of interpretation of the graphs is what Peirce termed “endoporeutic”: one looks for the outermost portions of ligatures on the sheet of assertions first, assigns semantic values to that part, and then proceeds inwards into the areas enclosed with ovals. In non-modal contexts, ligatures are not well-formed graphs because they may cross the enclosures.

The diagrammatic nature of EGs consists in the iconic relationship between forms of relations exhibited in the diagrams and the real relations in the universe of discourse. Peirce was convinced that, since these graphical systems exploit a proper diagrammatic syntax, they—together with any of their extensions that would be introduced to cover modalities, non-declarative expressions, speech acts, and so forth—can express any assertion, however intricate. Guided by the precepts laid out by the diagrammatic forms of expression, and together with the simple illative permissions by which deductive inference proceeds, the conclusions from premises can be “read before one’s eyes”; these graphs present what Peirce believed is a “moving picture of the action of the mind in thought” (MS 298; LoF, p. 655; late 1906-1907).

If upon one lantern-slide there be shown the premisses of a theorem as expressed in these graphs and then upon other slides the successive results of the different transformations of those graphs; and if these slides in their proper order be successively exhibited, we should have in them a veritable moving picture of the mind in reasoning. (MS 905; LoF, p. 723; late 1907-1908)

The theory of EGs that uses only the notation of ovals and the spatial notion of juxtaposition of graphs is termed by Peirce the Alpha part of the EGs, and it corresponds to propositional logic. The extension of the alpha part with ligatures and rhemas (also termed spots by Peirce) gives rise to the Beta part, and it corresponds to fragments of first-order predicate calculus. What Peirce termed the Gamma part was a boutique of a number of developments, including various modalities such as metaphysical, epistemic and temporal modalities, as well as extensions of such graphs with ligatures. In Peirce’s writings, there are developments of graphical systems for higher-order logics and abstraction (Peirce’s “logic of potentials”), the logic of collections, and investigation of meta-logical expressions that use the language of graphs to talk about notions and properties of the graphs in that language (Peirce’s “graphs of graphs”). He mentions late in 1911 that the Delta part would also need to be added, most likely because of the ever-expanding systems that had been mushrooming in the Gamma part.

Peirce’s further contributions to deductive logic. While the theory of existential graphs was his chief contribution, Peirce’s other contributions to the development of modern logic were numerous. In the Logic Notebook (1909) he defined a number of operations for three-valued logic and gave semantics for them by defining truth-tables for these new connectives (Fisch & Turquette 1966). In these systems, which he called triadic logic, the third value is “the limit” between “true” and “not true,” and it applies to what Lane (1999) has identified as boundary-propositions: in Peirce’s terms, boundary-propositions have “a lower mode of being” which can “neither be determinately P, nor determinately not-P,” but are “at the limit between P and not P” (MS 399, p. 344r, 1909). Peirce defined several connectives to realize this idea in alternative ways, including four one-place connectives which were later reinvented as strong negation, two Post negations and the Tertium function, as well as six two-place connectives, including one that pertains to the logic of ordinary discourse.
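
For illustration only, a three-valued truth table can be tabulated as follows (Python). The connectives used here are Kleene-style minimum/maximum stand-ins chosen for brevity, not Peirce’s own Logic Notebook tables, for which see Fisch & Turquette 1966; the point is merely how a third value, marking the limit between the true and the not true, enters the tables:

# F = false, L = the limit value, V = true; the numerals only fix an ordering.
F, L, V = 0, 1, 2

def neg(p):     return 2 - p            # swaps true and false, leaves the limit fixed
def conj(p, q): return min(p, q)
def disj(p, q): return max(p, q)

name = {0: "F", 1: "L", 2: "V"}
print("p q | not-p  p.q  p+q")
for p in (V, L, F):
    for q in (V, L, F):
        print(name[p], name[q], "|", name[neg(p)], "   ", name[conj(p, q)], " ", name[disj(p, q)])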

Generally, Peirce divided deduction in two: on the one hand, deduction is either necessary or probable (deductive reasoning about probabilities), and on the other hand, deduction is either corollarial or theorematic. Corollarial deduction is reasoning “where it is only necessary to imagine any case in which the premisses are true in order to perceive immediately that the conclusion holds in that case.” Theorematic deduction “is deduction in which it is necessary to experiment in the imagination upon the image of the premiss in order from the result of such experiment to make corollarial deductions to the truth of the conclusion” (MS L 75, 1902). He considered the theorematic/corollarial distinction his first real discovery in the philosophy of mathematics. Theorematic deductions can be of different kinds and degrees of complexity, and he took the classification of various types of theorematic deductions to be of the utmost value in the theory of logic (MS 617; MS 201; Peirce 1908). Stjernfelt (2014) proposes a new classification of theorematic inferences. Hintikka (1980) has argued that reasoning is theorematic if it increases the number of layers of quantifiers, and that an argument is the more theorematic the more new individuals are used in it (see also Ketner 1985; Zeman 1986; Hoffmann 2010).

Zooming into some of the details of Peirce’s systems of logic, including those of diagrammatic logics, one finds a treasury of developments the meaning of which is only beginning to unravel over a century later (Bellucci, Pietarinen & Stjernfelt 2014). In 1886, Peirce suggested in a letter to his former student Allan Marquand, who had designed mechanical logic machines for syllogistic reasoning, that “it is by no means hopeless to expect to make a machine for really difficult problems. But you would have to proceed step by step. I think electricity would be the best thing to rely on” (L 269, Peirce to Marquand, 30 December, 1886; W5, p. 422). He then showed how switching circuits can be connected serially and in parallel, noting that these two configurations correspond to logical multiplication (serial connection as conjunction) and logical addition (parallel connection as disjunction). In addition to the idea of real logical machines running on electricity, Peirce was also very interested in the philosophical question of whether living intelligence is required in performing deductive reasoning, an issue of continuing relevance to A.I. and to the prospects of automated theorem proving. In 1902 he developed two notational systems with sixteen binary connectives to map out all of the possible truth functions of the binary propositional calculus (Clark 1997; Zellweger 1997). According to Max Fisch, “No other logician compares with Peirce in attention to systems of notation and to sign-creation” (Fisch 1982, p. 132). Peirce’s work on these notational systems foreshadowed geometrical structures of logic, including spaces revealed by the study of the geometry of negation and other operators. Based on Peirce’s conceptual and sign-theoretic considerations, an apparatus for displaying and performing a complete set of the sixteen binary connectives in a two-valued propositional logic was patented in the U.S.A. in 1981 by Shea Zellweger.
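
That there are exactly sixteen binary connectives is a simple combinatorial fact (two possible outputs for each of the four input pairs), and they can be enumerated mechanically; the sketch below (Python; the ordering and labels are ours, not Peirce’s or Zellweger’s notation) lists them by their truth tables:

from itertools import product

inputs = list(product([True, False], repeat=2))      # (T,T), (T,F), (F,T), (F,F)
tables = [dict(zip(inputs, outputs))
          for outputs in product([True, False], repeat=4)]

print(len(tables))                                   # 16
# For instance, conjunction is the unique table that is true only on (T, T):
conjunction = next(t for t in tables
                   if list(t.values()) == [True, False, False, False])
print(conjunction)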

Peirce also worked on early forms of topology (Havenel 2010), including studies on what might be recognized as rudimentary versions of homologies and knots, in his attempts to find pathways not only to logical issues but also to questions in philosophy of mathematics (Murphey 1961; Moore 2010).

Moreover, his diagrammatic systems of modal logic included suggestions for defining several types of multi-modal logics in terms of tinctures of the areas of graphs. Tinctures enable logic to assert, among other things, modalities such as necessities and metaphysical possibilities, and so call for changes in how the corresponding logics behave, including the identification of individuals in the presence of multiple universes of discourse. He defined epistemic operators in terms of subjective possibilities, which, just as in contemporary epistemic logic, are epistemic possibilities defined as duals of knowledge operators. He analyzed the meaning of identities between actual and possible objects in quantified multi-modal logics. As an example, the two graphs given in Figures 9 and 10, which he presented in a 1906 draft of the Prolegomena paper (MS 292), illustrate the nature of the interplay between epistemic modalities and quantification.

Fig. 9

Fig. 10

The graph in Figure 9 is read “There is a man who is loved by one woman and loves a woman known by the Graphist to be another.” The reason is this. In the equivalent graph depicted in Figure 10 the woman who loves is denoted by the name A, and the woman who is loved is denoted by the name B. The shaded area is a tincture that refers to the modality of subjective possibility. Thus the graph in Figure 10 means that it is subjectively impossible, by which Peirce means that “it is contrary to what is known by the Graphist” (= the modeller of the graph), that A should be B. In other words, the woman who loves and the woman who is loved (whom the graph does not assert to be otherwise known to the Graphist) are known by the Graphist not to be the same person.
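
The duality between knowledge and subjective possibility at work in this reading can be illustrated with a minimal possible-worlds sketch in Python; the model, the worlds, and the valuation below are assumptions introduced only for illustration and are not Peirce’s graphical formalism. A proposition is subjectively possible for the Graphist exactly when its negation is not known, so the graph of Figure 10 corresponds to the claim that “A is B” fails in every world the Graphist admits.

    # A minimal possible-worlds sketch (an illustrative assumption, not Peirce's
    # graphical formalism): subjective possibility as the dual of knowledge,
    # M(p) if and only if not K(not p).
    from typing import Callable, Set

    World = str

    def knows(accessible: Set[World], prop: Callable[[World], bool]) -> bool:
        # K(p): p holds in every world compatible with what the Graphist knows.
        return all(prop(w) for w in accessible)

    def subjectively_possible(accessible: Set[World], prop: Callable[[World], bool]) -> bool:
        # M(p) = not K(not p): p holds in at least one epistemically open world.
        return not knows(accessible, lambda w: not prop(w))

    # Toy reading of Fig. 10: "A is B" fails in every world the Graphist admits.
    worlds = {"w1", "w2"}
    a_is_b = {"w1": False, "w2": False}
    print(subjectively_possible(worlds, lambda w: a_is_b[w]))  # False: A = B is subjectively impossible
    print(knows(worlds, lambda w: not a_is_b[w]))              # True: the Graphist knows A is not B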

Peirce’s work highlights the philosophical significance of ideas that were rediscovered later and largely after the mid-twentieth century, though often in different clothes: in Peirce’s largely unpublished works one finds him addressing such topics as multi-modal logics and possible-worlds semantics, quantification into modal contexts, cross-world identities (in MS 490 he termed these the special relations connecting objects in different possible worlds; for references, see Pietarinen 2005), cumulative and branching quantifiers (the latter being related to independence-friendly logic, see Pietarinen 2015b), as well as what later on became known as “Peirce’s Puzzle” (Dekker 2001; Hintikka 2011; Pietarinen 2015b), namely the question of the meaning of indefinites in conditional sentences, which Peirce himself analyzed in quantified modal extensions of EGs.

Far from merely anticipating later discoveries, then, Peirce’s logic in general puts what later on came to be explored in the fields of philosophical logic, formal semantics and pragmatics, philosophy of logic, mathematics, mind and language, cognitive and computing sciences, and history and philosophy of science, into a systematic logico-semeiotic perspective. From time to time, his ideas even surpass stagnant contemporary discussions, especially in the philosophy of logic and mathematics. [See for example Bellucci, Pietarinen & Stjernfelt 2014; Lupher & Adajian 2015; Sowa 2006; Zalamea 2012a; Zalamea 2012b; PM. For further details on Peirce’s deductive logic, see the collection of Houser and others, eds. 1997. Hilpinen 2004 provides a useful overview.]

iv. Inductive Logic

In 1865 (W1, pp. 263-64) Peirce defines induction as inference from Case and Result to Rule. Its general form is:

Case: M1, M2, M3, M4 are S.
Result: M1, M2, M3, M4 are P.
Rule: Therefore, all S are P.

A certain number of objects (M1, M2, M3, M4), known to belong to a certain class (S), possess a certain character (P); therefore, it can be inferred inductively that the whole class S possesses that character. I notice that neat, swine, sheep, and deer, which I know are cloven-hoofed, are herbivores. Therefore, I infer inductively that all cloven-hoofed animals are herbivores.

Later, Peirce came to divide induction into three principal kinds. Crude induction is the lowest form of induction, based upon the common practice of generalizing about future events on the ground of previous experience. For example, “No instance of a genuine power of clairvoyance has ever been established: So I presume there is no such thing”; “cancer is incurable, because every known case has proved to be so.” Its general form is “All observed As are B. Therefore, All As are B.” It is the weakest form of inductive reasoning in terms of security. Qualitative induction is the intermediate kind in terms of security. It is what Peirce had earlier called hypothetical reasoning or abduction. It consists in testing a hypothesis by sampling the possible predications that may be made on the basis of it (CP 7.216). Qualitative induction is reasoning that tests hypotheses already formulated. It should not be confused with abduction, which is reasoning that originates new hypotheses. Quantitative induction is the highest form of induction in terms of security. It investigates the real probability that a member of a certain class will have a certain character. Its procedure consists in finding a representative sample of the class and noting the proportion of its members that possess the character P. Then, the inference is drawn that the proportion holds for the whole class. Its logical form is

S1, S2, S3, S4, and so forth, are taken at random from the Ms.
The proportion p of S1, S2, S3, S4 is P.
Hence, probably and approximately, the same proportion p of the Ms are P.

The inversion of a quantitative induction gives us a statistical deduction, whose form is

The proportion p of the Ms are P.
S1, S2, S3, S4, and so forth, are taken at random from the Ms.
Hence, the proportion p of them is P.
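
The two schemas can be given a toy computational reading; the population, its size, and the true proportion in the following Python sketch are illustrative assumptions, not anything Peirce specified. Quantitative induction estimates the proportion of Ps in the whole class from a random sample, and, in the spirit of the long-run justification discussed below, larger samples tend to correct earlier estimates.

    # A toy instance of quantitative induction (population, size, and the true
    # proportion are hypothetical). A random sample of the Ms is inspected and
    # the sample proportion of Ps is inferred, "probably and approximately,"
    # to hold of the whole class; larger samples correct earlier estimates.
    import random

    random.seed(0)
    true_proportion = 0.30
    Ms = [random.random() < true_proportion for _ in range(100_000)]  # True means "is P"

    for n in (10, 100, 1_000, 10_000):
        sample = random.sample(Ms, n)   # S1, S2, S3, ... taken at random from the Ms
        estimate = sum(sample) / n      # the proportion of the sample that is P
        print(f"sample of {n:>6}: estimated proportion of P = {estimate:.3f}")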

Although crude, qualitative, and quantitative induction are different in kind, their justification is, according to Peirce, the same:

The validity of Induction consists in the fact it proceeds according to a method which though it may give provisional results that are incorrect will yet if steadily pursued, eventually correct any such error. […] all Induction possesses this kind of validity, and […] no Induction possesses any other kind that is more than a further determination of this kind. (MS 293, 1907)

The validity rests upon induction being self-corrective: in the long run induction is bound to lead us ever closer to the correct representation of reality. Its validity is therefore linked to esse in futuro, to the possibility of self-correction of the very method itself. Any actual induction that is performed may well be wrong or partly wrong, but it remains valid because its leading principle is valid, that is, it is conducive to truth in the long run.

Peirce’s polemical target was a theory that would make the validity of induction rest upon some principle of uniformity or regularity in nature. According to Peirce, that was how John S. Mill and Philodemus of Gadara (ca. 110–ca. 30 B.C.E.) attempted, unsoundly, to justify induction. Of the several objections that Peirce raised from time to time against this way of justifying induction, one is worth reporting. Mill argues that a universe without any regularity is imaginable, and that in that universe inductions would be invalid. But, Peirce counters, the absence of uniformity, that is, the absence among certain objects S of the character P, is itself a uniformity. No universe is imaginable in which induction is not valid. According to Peirce, “even if nature were not uniform, induction would be sure to find it out, so long as inductive reasoning could be performed at all” (CP 2.775).

Cheng 1969, Goudge 1946, Merrill 1975 and Forster 1989 provide further details on Peirce’s inductive logic.

c. Methodeutic

The third branch of Peirce’s logic is methodeutic, which he also called speculative rhetoric. He defined it to be “the study of the proper way of arranging and conducting an inquiry” (MS 606, p. 17), depicting it as being “not so exact in its conclusions as is critical logic” (MS L 75, 1902) and as involving “certain psychological principles” (MS 633, 1909). But it nevertheless is a theoretical study and not an art. Methodeutic is based upon critics, and considers not what is admissible (logical validity) but what is advantageous (logical economy). It is a “theoretical study of advantages” (MS L 75, 1902).

Abduction is of special interest to methodeutic, because abduction is the only mode of inference that can initiate a scientific hypothesis. But being critically justifiable is not by itself sufficient to make a hypothesis a good one:

Any hypothesis which explains the facts is justified critically. But among justifiable hypotheses we have to select that one which is suitable for being tested by experiment. (MS L 75, 1902)

Among critically equivalent hypotheses (that is, hypotheses that explain the facts), one should be able to select for testing those that are capable of experimental verification. [Being capable of experimental verification is in Peirce’s philosophy of science to be conceived in the wide sense, including mental experimentation and imaginative activities in our thoughts (Bellucci & Pietarinen 2015b). It is not the same thing as the empirical verification criterion of the positivists, which Peirce often criticized.] This is the core of Peirce’s philosophy of pragmati(ci)sm, which teaches that the whole meaning of a hypothesis is in its conceivable practical (that is, experienceable) effects; pragmaticism therefore is “nothing else than the question of the logic of abduction” (CP 5.196, 1903).

In turn, among “pragmatistically” equivalent hypotheses (that is, hypotheses that are capable of experimental verification) one should select those that in the sense of Peirce’s economy of research are the cheapest ones. His argument for the economic character of methodeutic is roughly as follows: the logical validity of abduction presupposes that nature be in principle explainable. This means that to discover is simply to expedite an event that would sooner or later occur. Therefore, the real service of a logic of abduction is of the nature of an economy. Economy itself depends on three factors: cost (of money, time, energy, thought), the value of the hypothesis itself, and its effects upon other projects and hypotheses (MS L 75, 1902; MS 690, CP 7.164-231, 1901).
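
One crude way to picture this weighing, under assumptions that go well beyond anything Peirce specified (the candidate hypotheses, the numbers, and the linear scoring rule below are all illustrative), is a short Python sketch that compares hypotheses along the three economic factors:

    # A deliberately crude, hypothetical scoring of candidate hypotheses along
    # Peirce's three economic factors: cost, intrinsic value, and effect on
    # other projects. The numbers and the linear rule are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class Hypothesis:
        name: str
        cost: float      # money, time, energy, thought required to test it
        value: float     # how much the hypothesis itself would explain if true
        effects: float   # expected benefit (or harm) to other lines of inquiry

    def economic_score(h: Hypothesis) -> float:
        return h.value + h.effects - h.cost

    candidates = [
        Hypothesis("H1: cheap, modest payoff", cost=1.0, value=2.0, effects=0.5),
        Hypothesis("H2: expensive, large payoff", cost=5.0, value=6.0, effects=0.0),
    ]
    best = max(candidates, key=economic_score)
    print(best.name)   # which hypothesis to test first under this toy weighing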

Although primarily concerned with abduction, methodeutic also has an interest in deduction and induction. Theorematic deductions (see §2.b.iii) manifest peculiar logical steps that are abductive rather than deductive. In order to overcome the lack of critical instruments for the investigation of those steps, Peirce emphasizes the need to have an inventory and logical classification of valuable steps in the history of mathematics which would become part of a methodeutic of necessary reasoning (Peirce 1908, MS 200-201).

Peirce also considered the study of the properties of different logical and mathematical notations and symbolisms as belonging to the department of methodeutic. In this respect, he coined the maxim of the ethics of terminology and of notation:

The person who introduces a conception into science has both the right and the duty of prescribing a terminology and a notation for it; and his terminology and notation should be followed except so far as it may prove positively and seriously disadvantageous to the progress of science. If a slight modification is sufficient to remove the objection, a much greater one should be avoided. (MS 530, 1902)

Induction too has its methodological side. The methods of the three classes of induction are all based on “samples,” and they all presuppose that the samples are representative of the class from which they are drawn: methodeutic should therefore teach methods of producing fair samples. Peirce’s own experimental work was exemplary in this respect, for it developed new statistical methods to ensure that truly randomized samples were obtained and that fully blinded testing conditions were secured. He emphasized the method of predesignation, which prescribes that the characters with respect to which the class is to be sampled be chosen beforehand, so that the sampler is not influenced by any agreement noticed among the members of the sample (see Goudge 1946).

Other things Peirce considers to pertain to methodeutic include the principles of definition, the methods of classification in general, and the doctrine of the clearness of ideas.

Peirce’s logic, conceived as semeiotic, characterizes a broad philosophical, methodological and scientific area of investigation. Although the present article has presented a number of developments in Peirce’s studies in deductive logic, the deductive part is only a fraction of the wider project of semeiotic, the theory and philosophy of signs, and the logic of science. From a contemporary perspective, deductive, formal, and mathematical logic may have become the mainstay of logic as such, but for Peirce other areas of logic, such as speculative grammar and the critics and methodeutic of abduction and induction, are at least as important as the deductive part of logic.

3. Peirce’s Logic in Historical Perspective

Peirce’s algebraic work in formal logic influenced Ernst Schröder (1841–1902), who drew heavily upon Peirce’s work in the three volumes of his Vorlesungen über die Algebra der Logik (Schröder 1890–1905). Peirce also successfully initiated a school in logic during his Johns Hopkins period (1879–1884), whose most evident manifestation is the richness and originality of the papers contained in the Studies in Logic (Peirce 1883). For the main part of his career, Peirce had been in contact and correspondence with the most prominent logicians, mathematicians and scientists of the time, and his works appeared in leading scientific journals and proceedings.

All these facts notwithstanding, the reception of Peirce’s deductive logic has been strangely erratic, even in the early days. Especially in his later period (1892–1914), Peirce worked virtually alone in an adverse environment and without much intellectual and material support. It is true that the recognition of his contributions has suffered from a long-term unavailability of his vast Nachlass of over 100,000 surviving pages of manuscripts and correspondence. In some cases at least, the explanation may be found in the unprecedented technical and mathematical standard and rigor characterizing his work. But what is certainly a chief reason behind the general neglect of Peirce’s logic is the rise, at the end of the 19th century, of what has later been named the Frege-Russell tradition in logic.

The historiography of logic seems to have accepted the idea, initially promoted by Bertrand Russell and subsequently canonized by the historian of logic Jean van Heijenoort (1912–1986), of a “Fregean revolution” in logic. In this narrative, modern mathematical logic (also deceptively called symbolic logic) has replaced traditional or Aristotelian logic. According to such a picture, the work of the “algebraists” (including Boole, De Morgan, Peirce, and Schröder) belongs to the pre-Fregean logical paradigm.

Anellis (2012b) identified seven features of such a “Fregean” revolution: (1) A propositional calculus with a truth-functional definition of connectives, especially the conditional. (2) Decomposition of propositions into function and argument instead of into subject and predicate. (3) A quantification theory, based on a system of axioms and inference rules. (4) Definitions of infinite sequence and natural number in terms of logical notions (that is, the logicization of mathematics). (5) Presentation and clarification of the concept of a formal system. (6) Relevance and use of logic for philosophical investigations (especially for philosophy of language). (7) Separating singular propositions, such as “Socrates is mortal,” from universal propositions, such as “All Greeks are mortal.” All these characteristics, Anellis argued, can be found in Peirce’s work, which therefore falls within the parameters of van Heijenoort’s conception of the Fregean revolution and the definition of mathematical logic. One also needs to remember that there are many characteristics of Peirce’s logic and philosophy of logic, vitally important to his logical vision, that either add to, modify, or reject those that have been taken to typify the Fregean tradition. What may be ill-named a Fregean revolution is thus found in a different, and perhaps more penetrating and consequential, shape in Peirce’s work.

Peirce and Frege discovered quantification theory around the same time (1879–1883). Frege’s work was at the time largely ignored. Russell credited Frege a posteriori with having founded modern logic in the Begriffsschrift (Frege 1879). However, while Frege’s notation was hardly ever used, the Peirce-Schröder notation was widely adopted by others. The important results of Löwenheim and Skolem at the beginning of the 20th century were presented in the Peirce-Schröder system without any trace of influence by Frege or Russell. Peano’s use of the existential and universal quantifiers derives from Schröder and Peirce, not from Frege. Unlike Frege, Peirce recognized the utmost importance of dependent quantifiers, experimented with that idea in various ways in the algebra of logic and in existential graphs, and proposed new systems and dimensions of quantification that involve independent quantification (MS 430). Peirce’s overall influence upon the development of modern logic was considerable, though its nature and scope remained ill understood for a long time (Putnam 1982; Dipert 1995; Pietarinen 2015a).

Peirce’s philosophy of logic had no better fate. Aside from Josiah Royce and especially Lady Victoria Welby, with whom Peirce corresponded on the logic of signs and semiotics during 1903-1910, Peirce’s radical idea of “logic as semeiotic” passed largely unnoticed. In the 1930s Charles Morris took, misleadingly, Peirce’s trivium of speculative grammar, critics and methodeutic to correspond to the division of the study of language into syntax, semantics, and pragmatics (Morris 1938, pp. 21-22). Carnap (1942) adopted Morris’ trichotomy and made it popular. Peirce’s philosophy of signs has since been studied by semioticians, led by the pioneering explorations of Roman Jakobson and Umberto Eco (see Eco 1975; Jakobson 1977; Eco 1984). Other aspects of Peirce’s philosophy of logic, such as the distinction between corollarial and theorematic deduction, his ideas on diagrammatic reasoning, and the evolution of new logical notations and meanings, are gaining the interest not only of logicians and historians of logic, but also of philosophers of science, cognitive scientists, and many scholars, scientists, artists, and practitioners looking for ways to overcome the boundaries of narrow conceptions of logic, reasoning, and representation, as well as the outdated 20th-century scientific methodologies that have characterized their respective fields. [See for example the 2014 Peirce Centennial Conference at Lowell as well as the Applying Peirce conference series at Helsinki in 2007 and 2014, which have brought together scholars and scientists interested in Peirce’s thought from virtually any field of science.]

From the perspectives of the history and philosophy of modern logic, it may not be entirely right to talk in strict terms about two traditions in logic, namely the algebraic and the symbolic. On the one hand, Peirce’s line of work in the algebra of logic led to the invention of a spectrum of methods in the semantic and model-theoretic tradition, while the logic that Schröder, for example, preferred was one that quantifies over the entire universe and was thus at bottom universalist, thereby sharing the same preference as Frege. On the other hand, Peirce’s continuous search for new notations for the purposes of logical analysis and representation made what others may have considered to be the subject of symbolic notations really the subject of diagrams and icons. Algebraic notations were for Peirce iconic, and often even very graphically so. What mattered to him was to remain clear about the significations of logical signs. Logical signs were to be interpreted in proper contexts and according to the purposes of the investigation at hand. Thus, Peirce’s philosophy of logic stands in stark contrast to purely formal, mathematical and proof-theoretic approaches to logic, which do not care so much for signification. Peirce should accordingly be counted in the pragmatic, rather than just the semantic, tradition in the philosophy of logic and language (Pietarinen 2006; Tiercelin 1991).

The famous van Heijenoort–Hintikka distinction between “logic as calculus” and “logic as a universal medium” is nonetheless instructive here (van Heijenoort 1967; Hintikka 1997; Peckhaus 2004). According to the former view of logic as calculus, methods and languages are many, they are reinterpretable according to the context and purposes at hand, and they admit of many and varying universes as well as modal and intensional considerations. The latter, universalist position means, in contrast, that there is one logic to “rule them all,” and so our thought is bounded by what that logic can express. Peirce fits squarely into the former camp. Here again it is not that all who worked on the algebra of logic would be members of that same camp (Schröder is a counterexample), or that all of those who in the literature have been tagged as formalists would share the universalist presuppositions (David Hilbert may serve as another kind of a counterexample). It may be one of the lessons of Peirce’s pragmaticism and the methodological pluralism which he exercised in his logic that one does not fix in advance what may in the future be considered to fall within the scope of logic.

4. References and Further Reading

  • Peirce’s works
  • 1867. An Improvement in Boole’s Calculus of Logic. Proceedings of the American Academy of Arts and Sciences 7, pp. 249-261.
  • 1870. Description of a Notation for the Logic of Relatives. Memoirs of the American Academy of Arts and Sciences 9, pp. 317-378.
  • 1880a. On the Algebra of Logic.  American Journal of Mathematics 3, pp. 15–57.
  • 1881. On the Logic of Number. American Journal of Mathematics 4, pp. 85-95.
  • 1883 (ed.). Studies in Logic by Members of the Johns Hopkins University. Boston: Little, Brown, and Co. 1883.
  • 1885. On the Algebra of Logic. A Contribution to the Philosophy of Notation. American Journal of Mathematics 7, pp. 197–202.
  • 1897. The Logic of Relatives. The Monist 7, pp. 161–217.
  • 1901-1902. Entries in Dictionary of Philosophy and Psychology, 3 vols, edited by Baldwin, James Mark. Cited as DPP followed by volume and page number.
  • 1906. Prolegomena to an Apology for Pragmaticism. The Monist 16, pp. 492–546.
  • 1908. Some Amazing Mazes. The Monist 18 (3), pp. 416-464.
  • 1931–1966. The Collected Papers of Charles S. Peirce, 8 vols., ed. by Hartshorne, C., Weiss, P. and Burks, A. W. Cambridge: Harvard University Press. Cited as CP followed by volume and paragraph number.
  • 1967. Manuscripts in the Houghton Library of Harvard University, as identified by Richard Robin, “Annotated Catalogue of the Papers of Charles S. Peirce,” Amherst: University of Massachusetts Press, 1967, and in “The Peirce Papers: A supplementary catalogue,” Transactions of the C. S. Peirce Society 7 (1971): 37–57. Cited as MS followed by manuscript number and, when available, page number.
  • 1976. The New Elements of Mathematics by Charles S. Peirce, 4 vols., ed. by Eisele, C. The Hague: Mouton. Cited as NEM followed by volume and page number.
  • 1982 – …. Writings of Charles S. Peirce: A Chronological Edition, 7 vols., ed. by Moore, E. C., Kloesel, C. J. W. et al. Bloomington: Indiana University Press. Cited as W followed by volume and page number.
  • 2010. Philosophy of Mathematics: Selected Writings, ed. by M. E. Moore, Bloomington and Indianapolis, IN: Indiana University Press. Cited as PM.
  • 2015. Logic of the Future. Peirce’s Writings on Existential Graphs, ed. by A.-V. Pietarinen, Bloomington: Indiana University Press. Cited as LoF.
  • Other works
  • Anellis, I. 2012a. Peirce’s Truth-Functional Analysis and the Origin of the Truth Table. History and Philosophy of Logic 33, pp. 37–41.
  • Anellis, I. 2012b. How Peircean was the ‘Fregean’ Revolution in Logic? arXiv:1201.0353.
  • Badesa, C. 2004. The Birth of Model Theory: Löwenheim’s Theorem in the Frame of the Theory of Relatives, Princeton: Princeton University Press.
  • Bellucci, F. & Pietarinen, A.-V. 2014. New Light on Peirce’s Concept of Retroduction and Scientific Reasoning. International Studies in the Philosophy of Science 28(2), pp. 1-21.
  • Bellucci, F. & Pietarinen, A.-V. 2015. Existential Graphs as an Instrument of Logical Analysis. Part I: Alpha, to appear.
  • Bellucci, F., Pietarinen, A.-V. & Stjernfelt, F. eds. 2014. Peirce: 5 Questions. VIP/Automatic Press.
  • Boole, G. 1847. The Mathematical Analysis of Logic. Cambridge: Macmillan, Barclay, & Macmillan.
  • Boole, G. 1854. An Investigation of the Laws of Thought. Cambridge: Walton & Maberly.
  • Brady, G. 2000. From Peirce to Skolem. Amsterdam: Elsevier Science.
  • Brent, B. 1987. Charles S. Peirce: Logic and the Classification of the Sciences. Kingston/Montreal: McGill-Queen’s University Press.
  • Burch, R. W. 2011. Peirce’s 10, 28, and 66 Sign-Types: The Simplest Mathematics. Semiotica 184, pp. 93–98.
  • Burks, A. W. 1946. Peirce’s Theory of Abduction. Philosophy of Science 13, pp. 301-306.
  • Carnap, R. 1942. Introduction to Semantics, Cambridge, Mass: MIT Press.
  • Chauviré, Ch. 1994. Logique et Grammaire Pure. Propositions, Sujets et Prédicats Chez Peirce. Histoire Epistémologie Langage 16, pp. 137–175.
  • Cheng, C.-Y. 1969. Peirce’s and Lewis’s Theories of Induction, The Hague: Martinus Nijhoff.
  • Clark, G. 1997. New Light on Peirce’s Iconic Notation for the Sixteen Binary Connectives. In Houser and others 1997, pp. 304-333.
  • Dedekind, R. 1888. Was sind und was sollen die Zahlen. Braunschweig: Vieweg.
  • Dekker, Paul 2001. Dynamics and Pragmatics of ‘Peirce’s Puzzle’, Journal of Semantics 18, pp. 211-241.
  • De Morgan, A. 1847. Formal Logic. London: Taylor and Walton.
  • De Morgan, A. 1860. On the Syllogism IV; and on the Logic of Relations. Transactions of the Cambridge Philosophical Society 10, pp. 331-358.
  • Dipert, R. 1995. Peirce’s Underestimated Place in the History of Logic: A Response to Quine. In Ketner, K. L. ed. Peirce and Contemporary Thought. New York: Fordham University Press, pp. 32-58.
  • Dipert, R. 2006. Peirce’s Deductive Logic: Its Development, Influence, and Philosophical Significance. In: Misak, C. (ed.). The Cambridge Companion to Peirce. Cambridge: Cambridge University Press, pp. 287-324.
  • Eco, U. 1975. Trattato di semiotica generale. Milano: Bompiani.
  • Eco, U. 1984. Semiotica e filosofia del linguaggio. Torino: Einaudi.
  • Fann, K. T. 1970. Peirce’s Theory of Abduction. The Hague: Martinus Nijhoff.
  • Ferriani, M. 1987. Peirce’s Analysis of the Proposition: Grammatical and Logical Aspects. In Ferriani, M. & Buzzetti, D. (eds.), Speculative grammar, universal grammar and philosophical analysis of language. Amsterdam: Benjamins, pp. 149-172.
  • Fisch, M. H. 1982. The Range of Peirce’s Relevance, The Monist 65, pp. 123-141. Reprinted in Fisch 1986, pp. 422-448.
  • Fisch, M. H. 1986. Peirce, Semeiotic and Pragmatism. Ed. by K. L. Ketner and C. J. W. Kloesel, Bloomington: Indiana University Press.
  • Fisch, M. H. & Turquette, A. 1966. Peirce’s Triadic Logic. Transactions of the Charles S. Peirce Society 2, pp.71-85.
  • Forster, P. 1989. Peirce on the Progress and Authority of Science. Transactions of the Charles S. Peirce Society 25, pp. 421–452.
  • Frege, G. 1879. Begriffsschrift: eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle: Louis Nebert.
  • Goudge, T. 1946. Peirce’s Treatment of Induction. Philosophy of Science 7, pp. 56-68.
  • Haack, S. 1993. Peirce and Logicism: Notes Towards an Exposition. Transactions of the Charles S. Peirce Society 29, pp. 33–56.
  • Havenel, J. 2010. Peirce’s Topological Concepts. In Moore 2010, pp. 283-322.
  • Hilpinen, R. 1982. On C. S. Peirceʼs Theory of the Proposition: Peirce as a Precursor of Game-Theoretical Semantics. The Monist 65, pp. 182-188.
  • Hilpinen, R. 1992. On Peirce’s Philosophical Logic: Propositions and Their Objects. Transactions of the Charles S. Peirce Society 28, pp. 467–488.
  • Hilpinen, R. 2004. Peirce’s Logic, in Gabbay, D.M., and J. Woods. 2004. Handbook of the History of Logic. Vol. 3: The Rise of Modern Logic From Leibniz to Frege. Vol. 3. Amsterdam: Elsevier North-Holland, pp. 611-658.
  • Hintikka, J. 1980. C. S. Peirce’s ‘First Real Discovery’ and Its Contemporary Relevance. Monist 63, pp. 304-315.
  • Hintikka, J. 1996. The Place of C. S. Peirce in the History of Logical Theory. In J. Brunning, J. & Forster, P. eds. The Rule of Reason: The Philosophy of Charles Sanders Peirce, Toronto: University of Toronto Press, pp. 13–33.
  • Hintikka, J. 1997. Lingua Universalis vs. Calculus Ratiocinator: An Ultimate Presupposition of Twentieth Century Philosophy. Dordrecht: Kluwer.
  • Hintikka, J. 2011. What the bald man can tell us. In: Biletzky, A. (ed.) Hues of Philosophy: Essays in Memory of Ruth Manor. London: College Publications.
  • Hoffmann, M. 2010. Theoric Transformations. Transactions of the Charles S. Peirce Society 46, pp. 570–590.
  • Houser, N. 1993. On ‘Peirce and Logicism’: A Response to Haack. Transactions of the Charles S. Peirce Society 29, pp. 57–67.
  • Houser, N., Roberts, D., Van Evra, J. eds. 1997. Studies in the Logic of Charles S. Peirce. Bloomington and Indianapolis: Indiana University Press.
  • Jakobson, R. 1977. A Few Remarks on Peirce, Pathfinder in the Science of Language. MLN 92, pp. 1026–1032.
  • Kapitan, T. 1992. Peirce and the Autonomy of Abductive Reasoning. Erkenntnis 37, pp. 1–26.
  • Kapitan, T. 1997. Peirce and the Structure of Abductive Inference. In Houser and others eds. 1997, pp. 477-496.
  • Kempe, A. B. 1886. A Memoir on the Theory of Mathematical Form. Philosophical Transactions of the Royal Society of London 177, pp. 1-70.
  • Ketner, K. L. 1985. How Hintikka Misunderstood Peirce’s Account of Theorematic Reasoning. Transactions of the Charles S. Peirce Society 21, pp. 407–418.
  • Lane, R. 1999. Peirce’s Triadic Logic Revisited. Transactions of the Charles S. Peirce Society 35, pp. 284–311.
  • Lewis, C. I. 1918.  A Survey of Symbolic Logic. Berkeley: University of California Press.
  • Lupher, T. and Adajian, T. eds. 2015. Philosophy of Logic: 5 Questions. Copenhagen: Automatic Press.
  • Ma, Minghui & Pietarinen, A.-V. 2015. A dynamic approach to Peirce’s interrogative construal of abductive reasoning, IFCoLog Journal of Logics and their Applications, in press.
  • Mannoury, G. 1909. Methodologisches und Philosophisches zur Elementar-Mathematik. Haarlem: P. Visser.
  • Merrill, D. D. 1997. Relations and Quantification in Peirce’s Logic, 1870-1885. In Houser and others eds. 1997, pp. 158-172.
  • Merrill, G. H. 1975. Peirce on Probability and Induction. Transactions of the Charles S. Peirce Society 11, pp. 90–109.
  • Moore, M. ed. 2010. New Essays on Peirce’s Mathematical Philosophy. Chicago: Open Court.
  • Morris, C.  W. 1938. Foundations of the Theory of Signs. In Morris, C. 1971. Writings on the General Theory of Signs. The Hague: Mouton.
  • Murphey, M. G. 1961. The Development of Peirce’s Philosophy, Cambridge, Mass: Harvard University Press, 2nd ed. 1993, Indianapolis: Hackett.
  • Paavola, S. 2004. Abduction as a Logic and Methodology of Discovery: The Importance of Strategies. Foundations of Science 9, pp. 267–283.
  • Peano, G. 1889. Arithmetices Principia. Nova Methodo Exposita. Torino: Bocca.
  • Peckhaus, V. 2004. Calculus Ratiocinator vs. Characteristica Universalis? The Two Traditions in Logic, Revisited. History and Philosophy of Logic 25, pp. 3-14.
  • Peirce, B. 1870. Linear Associative Algebra. Lithographed ed., Washington, D.C.
  • Pietarinen, A.-V. 2005. Compositionality, Relevance and Peirce’s Logic of Existential Graphs. Axiomathes 15, pp. 513-540.
  • Pietarinen, A.-V. 2006. Signs of Logic: Peircean Themes on the Philosophy of Language, Games and Communication. Dordrecht: Springer.
  • Pietarinen, A.-V. 2011. Existential Graphs: What a Diagrammatic Logic of Cognition Might Look Like. History and Philosophy of Logic 32, pp. 265–281.
  • Pietarinen, A.-V. 2013. Logical and Linguistic Games from Peirce to Grice to Hintikka (with comments by J. Hintikka). Teorema 33, pp. 121-136.
  • Pietarinen, A.-V. 2015a. Exploring the Beta Quadrant. Synthese 192, pp. 941-970.
  • Pietarinen, A.-V. ed. 2015b. Two Papers on Existential Graphs by Charles S. Peirce: 1. Recent Developments of Existential Graphs and their Consequences for Logic (MS 498, 499, 490, S-36, 1906), 2. Assurance through Reasoning (MS 669, 670, 1911). Synthese 192, pp. 881-922.
  • Pietarinen, A.-V. & Bellucci, F. 2015a. Habits of Reasoning: On the Grammar and Critics of Logical Habits. In D. E. West & M. Anderson eds. Consensus on Peirce’s Concept of Habit: Before and Beyond Consciousness. Dordrecht: Springer.
  • Pietarinen, A.-V. & Bellucci, F. 2015b. The Iconic Moment: Towards a Peircean Theory of Diagrammatic Imagination. In J. Redmond, A. N. Fernàndez, O. Pombo eds. Epistemology, Knowledge, and the Impact of Interaction, Dordrecht: Springer.
  • Prior, A. N. 1958. Peirce’s Axioms for Propositional Calculus. The Journal of Symbolic Logic 23, pp. 135–136.
  • Prior, A. N. 1964. The Algebra of the Copula. In Moore, E. & Robin, R. eds. Studies in the Philosophy of Charles Sanders Peirce. Amherst: The University of Massachusetts Press, pp. 79-94.
  • Putnam, H. 1982. Peirce the Logician. Historia Mathematica 9, pp. 290-301.
  • Roberts, D. D. 1973. The Existential Graphs of Charles S. Peirce. The Hague–Paris: Mouton.
  • Schröder, E. 1890-1905. Vorlesungen über die Algebra der Logik, 3 vols. Leipzig: Teubner.
  • Shields, P. 1981. Charles S. Peirce on the Logic of Number, 2nd ed. Boston: Docent Press, 2012.
  • Shin, S.-J. 2002. The Iconic Logic of Peirce’s Graphs. Cambridge, MA: MIT Press.
  • Short, T. L. 2007. Peirce’s Theory of Signs. Cambridge: Cambridge University Press.
  • Sowa, J. 2006. Peirce’s Contributions to the 21st Century. 14th International Conference on Conceptual Structures, Aalborg, Denmark, July 16-21, Lecture Notes in Computer Science 4068, pp. 54-69.
  • Stjernfelt, F. 2014. Natural Propositions. The Actuality of Peirce’s Doctrine of Dicisigns, Boston: Docent Press.
  • Sylvester, J. J. 1878. On an Application of the New Atomic Theory to the Graphical Representation of the Invariants and Covariants of Binary Quantics. American Journal of Mathematics 1, pp. 64-104.
  • Tarski, A. 1941. On the Calculus of Relations. Journal of Symbolic Logic 6, pp. 73-89.
  • Tiercelin, C. 1991. Peirce’s Semiotic Version of The Semantic Tradition in Formal Logic. In New Inquiries into Meaning and Truth, N. Cooper & P. Engel eds. Harvester Wheatsheaf: St. Martin’s Press, pp. 187–210.
  • Van Heijenoort, J. 1967. Logic as Calculus and Logic as Language. Synthese 17, pp. 324–330.
  • Walsh, A. 2012. Relations between Logic and Mathematics in the Work of Benjamin and Charles S. Peirce. Boston: Docent Press.
  • Weiss, P. & Burks, A. 1945. Peirce’s Sixty-Six Signs. The Journal of Philosophy 42, pp. 383–388.
  • Zalamea, F. 2012a. Synthetic Philosophy of Contemporary Mathematics, Urbanomic.
  • Zalamea, F. 2012b. Peirce’s Logic of Continuity: A Mathematical and Conceptual Approach, Docent Press.
  • Zellweger, S. 1997. Untapped Potential in Peirce’s Iconic Notation for the Sixteen Binary Connectives. In Houser and others eds. 1997, pp. 334-386.
  • Zeman, J. 1964. The Graphical Logic of Charles S. Peirce, Ph.D. dissertation, University of Chicago.
  • Zeman, J. 1986. Peirceʼs Philosophy of Logic. Transactions of the Charles S. Peirce Society 22, pp. 1-22.

 

Author Information

Francesco Bellucci
Email: bellucci.francesco@gmail.com
Tallinn University of Technology
Estonia

and

Ahti-Veikko Pietarinen
Email: ahti-veikko.pietarinen@ttu.ee
Tallinn University of Technology
Estonia

Thomas S. Kuhn (1922—1996)

Thomas Samuel Kuhn, although trained as a physicist at Harvard University, became an historian and philosopher of science through the support of Harvard’s president, James Conant. In 1962, Kuhn’s renowned The Structure of Scientific Revolutions (Structure) helped to inaugurate a revolution—the 1960s historiographic revolution—by providing a new image of science. For Kuhn, scientific revolutions involved paradigm shifts that punctuated periods of stasis or normal science. Towards the end of his career, however, Kuhn underwent a paradigm shift of his own—from a historical philosophy of science to an evolutionary one.

In this article, Kuhn’s philosophy of science is reconstructed chronologically. To that end, the following questions are entertained: What was Kuhn’s early life and career? What was the road towards Structure? What is Structure? Why did Kuhn revise Structure? What was the road Kuhn took after Structure? At the heart of the answers to these questions is the person of Kuhn himself, especially the intellectual and social context in which he practiced his trade. This chronological reconstruction of Kuhn’s philosophy begins with his work in the 1950s on physical theory in the Lowell lectures and on the Copernican revolution and ends with his work in the 1990s on an evolutionary philosophy of science. Rather than present Kuhn’s philosophy as a finished product, this approach endeavors to capture it in the process of its formation so as to represent it accurately and faithfully.

Table of Contents

  1. Early Life and Career
  2. The Road to Structure
    1. The Lowell Lectures
    2. The Copernican Revolution
    3. The Last Mile to Structure
  3. The Structure of Scientific Revolutions
  4. The Road after Structure
    1. Historical and Historiographic Studies
    2. Metahistorical Studies
    3. Evolutionary Philosophy of Science
  5. Conclusion
  6. References and Further Reading
    1. Kuhn’s Work
    2. Secondary Sources

1. Early Life and Career

Kuhn was born in Cincinnati, Ohio, on 18 July 1922. He was the first of two children born to Samuel L. and Minette (née Stroock) Kuhn, with a brother Roger born several years later. His father was a native Cincinnatian and his mother a native New Yorker. Kuhn’s father, Sam, was a hydraulic engineer, trained at Harvard University and at the Massachusetts Institute of Technology (MIT) prior to World War I. He entered the war and served in the Army Corps of Engineers. After leaving the armed services, Sam returned to Cincinnati for several years before moving to New York to help his recently widowed mother, Setty (née Swartz) Kuhn. Kuhn’s mother, Minette, was a liberally educated person who came from an affluent family.

Kuhn’s early education reflected the family’s liberal progressiveness. In 1927, Kuhn began schooling at the progressive Lincoln School in Manhattan. His early education taught him to think independently, but by his own admission, there was little content to the thinking. He remembered that by the second grade, for instance, he was unable to read proficiently, much to the consternation of his parents.

When Kuhn was entering the sixth grade, his family moved to Croton-on-Hudson, a small town about fifty miles from Manhattan, and the adolescent Kuhn attended the progressive Hessian Hills School. According to Kuhn, the school was staffed by left-oriented radical teachers, who taught the students pacifism. When he left the school after the ninth grade, Kuhn felt he was a bright and independent thinker. After spending an uninspired year at the preparatory school Solebury in Pennsylvania, Kuhn spent his last two years of high school at the Yale-preparatory Taft School in Watertown, Connecticut. He graduated third in his class of 105 students and was inducted into the National Honor Society. He also received the prestigious Rensselaer Alumni Association Medal.

Kuhn matriculated at Harvard College in the fall of 1940, following in his father’s and uncles’ footsteps. At Harvard, he acquired a better sense of himself socially by participating in various organizations. During his first year, Kuhn took a yearlong philosophy course. In the first semester, he studied Plato and Aristotle; in the second semester, he studied Descartes, Spinoza, Hume, and Kant. He intended to take additional philosophy courses but could not find the time. He did attend several of George Sarton’s lectures on the history of science, but he found them boring.

At Harvard, Kuhn agonized over majoring in either physics or mathematics. After seeking his father’s counsel, he chose physics because of its career opportunities. Interestingly, what attracted him to physics and mathematics was their problem-solving traditions. In the fall of his sophomore year, the Japanese attacked Pearl Harbor, and Kuhn expedited his undergraduate education by going to summer school. The physics department focused predominantly on teaching electronics, and Kuhn followed suit.

Kuhn underwent another radical transformation during his sophomore year. Although he had been trained a pacifist, the atrocities perpetrated in Europe during World War II, especially by Hitler, horrified him. Kuhn experienced a crisis, since he was unable to defend pacifism reasonably. The outcome was that he became an interventionist, which was the position of many at Harvard—especially its president, Conant. The episode left a lasting impact upon him. In a Harvard Crimson editorial, Kuhn supported Conant’s effort to militarize the universities in the United States. The editorial came to the attention of the administration, and eventually Conant and Kuhn met.

In the spring of 1943, Kuhn graduated summa cum laude from Harvard College with an S.B. After graduation, he worked for the Radio Research Laboratory located in Harvard’s biology building. He conducted research on radar counter technology, under John van Vleck’s supervision. The job procured for Kuhn a deferment from the draft. After a year, he requested a transfer to England and then to the continent, where he worked in association with the U.S. Office of Scientific Research and Development. The trip was Kuhn’s first abroad and he felt invigorated by the experience. However, Kuhn realized that he did not like radar work, which led him to reconsider whether he wanted to continue as a physicist. But, these doubts did not dampen his enthusiasm for or belief in science. During this time, Kuhn had the opportunity to read what he wanted; he read in the philosophy of science, including authors such as Bertrand Russell, P.W. Bridgman, Rudolf Carnap, and Philipp Frank.

After V.E. day in 1945, Kuhn returned to Harvard. As the war abated with the dropping of atomic bombs on Japan, Kuhn activated an earlier acceptance into graduate school and began studies in the physics department. Although Kuhn persuaded the department to permit him to take philosophy courses during his first year, he again chose the pragmatic course and focused on physics. In 1946, Kuhn passed the general examinations and received a master’s degree in physics. He then began dissertation research on theoretical solid-state physics, under the direction of van Vleck. In 1949, Harvard awarded Kuhn a doctorate in physics.

Although Kuhn had high regard for science, especially physics, he was unfulfilled as a physicist and continually harbored doubts during graduate school about a career in physics. He had chosen both a dissertation topic and an advisor to expedite obtaining a degree. But, he was to find direction for his career through Conant’s invitation in 1947 to help prepare a historical case-based course on science for upper-level undergraduates. Kuhn accepted the invitation to be one of two assistants for Conant’s course. He undertook a project investigating the origins of seventeenth-century mechanics, a project that would transform his image of science.

That transformation came, as Kuhn recounted later, on a summer day in 1947 as he struggled to understand Aristotle’s idea of motion in Physics. The problem was that Kuhn tried to make sense of Aristotle’s idea of motion using Newtonian assumptions and categories of motion. Once he realized that he had to read Aristotle’s Physics using assumptions and categories contemporary to when the Greek philosopher wrote it, suddenly Aristotle’s idea of motion made sense.

After this experience, Kuhn realized that he wanted to be a philosopher of science by doing history of science. His interest was not strictly history of science but philosophy, for he felt that philosophy was the way to truth, and truth was what he was after. To achieve that goal, Kuhn asked Conant to sponsor him as a junior fellow in the Harvard Society of Fellows. Harvard had initiated the society to provide promising young scholars with freedom from teaching for three years to develop a scholarly program. Kuhn’s colleagues stimulated him professionally, especially a senior fellow by the name of Willard Quine. At the time, Quine was publishing his critique of the distinction between the analytic and the synthetic, which Kuhn found reassuring for his own thinking.

Kuhn began as a fellow in the fall of 1948, which provided him the opportunity to retool as a historian of science. Kuhn took advantage of the opportunity and read widely over the next year and a half in the humanities and sciences. Just prior to his appointment as a fellow, Kuhn was also undergoing psychoanalysis. This experience allowed him to see other people’s perspectives and contributed to his approach for conducting historical research.

2. The Road to Structure

a. The Lowell Lectures

In 1950, the trustee of the Lowell Institute, Ralph Lowell, invited Kuhn to deliver the 1951 Lowell lectures. In these lectures, Kuhn outlined a conception of science that contrasted with the traditional philosophy of science’s conception, in which facts are slowly accumulated and stockpiled in textbooks. Kuhn began by assuring his audience that he, as a once practicing scientist, believed that science produces useful and cumulative knowledge of the world, but that traditional analysis of science distorts the process by which scientific knowledge develops. He went on to inform the audience that the history of science could be instructive for identifying the process by which creative science advances, rather than focusing on the finished product promulgated in textbooks. Because textbooks only state immutable scientific laws and marshal the experimental evidence to support those laws, they conceal the creative process that leads to the laws in the first place.

Kuhn then presented an alternative historical approach to scientific methodology. He claimed that the traditional account, on which Galileo rejected Aristotle’s physics because of Galileo’s experiments, is mistaken. Rather, Galileo rejected Aristotelianism as an entire system. In other words, Galileo’s evidence was necessary but not sufficient; the Aristotelian system as a whole was under evaluation, which also included its logic. Next, Kuhn proposed an alternative image of science based on the new approach to the history of science. He introduced the notion of conceptual frameworks, and drew from psychology to defend the advancement of science through scientists’ predispositions. These predispositions allow scientists to negotiate a professional world and to learn from their experiences. Moreover, they are important in organizing the scientist’s professional world, and scientists do not dispense with them easily. Change in them represents a foundational alteration in a professional world.

Kuhn argued that although logic is important for deriving meaning and for managing and manipulating knowledge, scientific language, being a natural language, outstrips such formalization. He thereby turned the tables on an important tool of the traditional analysis of science. By revealing the limitations of logical analysis, he showed that logic is necessary but insufficient for justifying scientific knowledge. Logic, then, cannot guarantee the traditional image of science as the progressive accumulation of scientific facts. Kuhn next examined logical analysis in terms of language and meaning. His position was that language is a way of dissecting the professional world in which scientists operate. But there is always ambiguity or overlap in the meaning of terms as that world is dissected. Certainly, scientists attempt to increase the precision of their terms, but not to the point that they can eliminate ambiguity. Kuhn concluded by distinguishing between creative and textbook science.

In the same year of the Lowell lectures, Harvard appointed Kuhn as an instructor and the following year as an assistant professor. Kuhn’s primary teaching duty was in the general education curriculum, where he taught Natural Sciences 4 along with Leonard Nash. He also taught courses in the history of science. And, it was during this time that Kuhn developed a course on the history of cosmology. Kuhn utilized course preparation for scholarly writing projects. For example, he handed out draft chapters of The Copernican Revolution to his classes.

A part of Kuhn’s motivation for developing a new image of science was the misconceptions of science held by the public. He blamed these misconceptions on introductory courses that stressed the textbook image of science as a fixed body of facts. After discussing this state of affairs with friends and with Conant, Kuhn set out to provide students with a more accurate image of science. The key to that image, claimed Kuhn, was science’s history, which displays the creative and dynamic nature of science.

b. The Copernican Revolution

In The Copernican Revolution, Kuhn claimed he had identified an important feature of the revolution, which previous scholars had missed: its plurality. What Kuhn meant by plurality was that scientists have philosophical and even religious commitments, which are important for the justification of scientific knowledge. This stance was anathema to traditional philosophers of science, who believed that such commitments played little—if any—role in the justification of scientific knowledge and relegated them to the discovery process.

Kuhn began his reconstruction of the Copernican revolution by establishing the genuine scientific character of ancient cosmological conceptual schemes, especially the two-sphere cosmology composed of an inner sphere for the earth and an outer sphere for the heavens. For Kuhn, conceptual schemes exhibit three important features: they are comprehensive in terms of scientific predictions, there is no final proof for them, and they are derived from other schemes. Finally, to be successful, conceptual schemes must perform both logical and psychological functions. The logical function is expressed in explanatory terms, while the psychological function is expressed in existential terms. Although the logical function of the two-sphere cosmology continued to be problematic, its psychological function afforded adherents a comprehensive worldview that included even religious elements.

The major logical problem with the two-sphere cosmology was the movement and positions of the planets. The conceptual scheme Ptolemy developed in the second century guided research for the next millennium. But problems surfaced with the scheme, and Copernicus’ predecessors could correct it only so far with ad hoc modifications. Kuhn asked at this point in the narrative why the Ptolemaic system, given its imperfection, was not overthrown sooner. The answer, for Kuhn, depended on a distinction between the logical and psychological dimensions of scientific revolutions. According to Kuhn, there are logically different conceptual schemes that can organize and account for observations. The difference among these schemes is their predictive power. Consequently, if an observation is made that is not compatible with a prediction, the scheme must be replaced. But before change can occur, there is also the psychological dimension of a revolution.

Copernicus had to overcome not only the logical dimension of the Ptolemaic system but also its psychological dimension. Aristotle had established this latter dimension by wedding the two-sphere cosmology to a philosophical system. Through the Aristotelian notion of motion among the earthly and heavenly spheres, the inner sphere was connected and depended on the outer sphere. The ability to presage future events linked astronomy to astrology. Such an alliance, according to Kuhn, provided a formidable obstacle to change of any kind.

But change began to take place, albeit slowly. From Aristotle to Ptolemy, a sharp distinction arose between the psychological dimensions of cosmology and the mathematical precision of astronomy. By Ptolemy’s time, astronomy was less concerned with the psychological dimensions of data interpretation and more with the accuracy of theoretical prediction. To some extent, this aided Copernicus, since whether the earth moved could be determined by theoretical analysis of the empirical data. But still, the earth as center of the universe gave existential consolation to people. The strands of the Copernican revolution, then, included not only astronomical concerns but also theological, economic, and social ones. Besides the Scholastic tradition, with its impetus theory of motion, other factors also paved the way for the Copernican revolution, including the Protestant revolution, navigation for oceanic voyages, calendar reform, and Renaissance humanism and Neoplatonism.

Copernicus, according to Kuhn, was the immediate inheritor of the Aristotelian-Ptolemaic cosmological tradition and, except for the position of the earth, was closer to that tradition than to modern astronomy. For Kuhn, De Revolutionibus precipitated a revolution and was not the revolution itself. Although the problem Copernicus addressed was the same as for his predecessors, that is, planetary motion, his solution was to revise the mathematical model for that motion by making the earth a planet that moves around the sun. Essentially, Copernicus maintained the Aristotelian-Ptolemaic universe but exchanged the sun for the earth as the universe’s center. Although Copernicus had eliminated major epicycles, he still used minor ones, and the accuracy of his planetary positions was no better than Ptolemy’s. Kuhn concluded that Copernicus did not really solve the problem of planetary motion.

Initially, according to Kuhn, there were only a few supporters of Copernicus’ cosmology. Although the majority of astronomers accepted the mathematical harmonies of De Revolutionibus after its publication in 1543, they rejected or ignored its cosmology. Tycho Brahe, for example, although relying on Copernican harmonies to explain astronomical data, proposed a system in which the earth was still the universe’s center. Essentially, it was a compromise between ancient cosmology and Copernican mathematical astronomy. However, Brahe recorded accurate and precise astronomical observations, which helped to compel others towards Copernicanism—particularly Johannes Kepler, who used its mathematical precision to solve the planetary motion problem. The final player Kuhn considered in the revolution was Galileo, who, Kuhn claimed, provided through telescopic observations not proof of but rather propaganda for Copernicanism.

Although astronomers achieved consensus during the seventeenth century, Copernicanism still faced serious resistance from Christianity. The Copernican revolution was completed with the Newtonian universe, which had an impact not only on astronomy but also on other sciences and even non-sciences. For instance, Newton’s universe changed the nature of God to that of a clockmaker. For Kuhn, the Newtonian system’s impact on disciplines other than astronomy was an example of its fruitfulness. Scientific progress, concluded Kuhn, is not the linear process, championed by traditional philosophers of science, in which scientific facts are stockpiled in a warehouse. Rather, it is the repeated destruction and replacement of scientific theories.

The professional reviews of The Copernican Revolution signaled Kuhn’s acceptance into the philosophical and historical communities. His reconstruction of the revolution was considered for the most part scientifically accurate and methodologically appropriate. Reviewers considered the integration of the scientific and the social an advance over other histories that ignored these dimensions of the historical narrative. Although philosophers appreciated the historical dimension of Kuhn’s study, they found its analysis imprecise by their standards. Overall, both the historical and philosophical communities expressed no major objections to the image of science that animated Kuhn’s narrative.

Kuhn’s reconstruction of the Copernican revolution portrayed a radically different image of science than that of traditional philosophers of science. Justification of scientific knowledge was not simply a logical or objective affair but also included non-logical or subjective factors. According to Kuhn, scientific progress is not a clear-sighted linear process aimed directly at the truth. Rather, there are contingencies that can divert and forestall the progress of science. Moreover, Copernicus’ revolution changed the way astronomers and non-astronomers viewed the world. This change in perceiving the world was the result of new sets of challenges, new techniques, and a new hermeneutics for interpreting data.

Besides differing from traditional philosophers of science, Kuhn's image of science put him at odds with Whig historians of science. These historians underrated ancient cosmologies by degrading them to myth or religious belief. Such a move was often a rhetorical ploy on the part of the victors to enhance the status of the current scientific theory. Only by showing that Aristotelian-Ptolemaic geocentric astronomy was authentic science could Kuhn argue for the radical transformation (revolution) that Copernican heliocentric astronomy involved. Kuhn also asserted that Copernicus' theory was not accepted simply for its predictive ability, since it was no more accurate than the original conceptual scheme, but because of non-empirical factors, such as the simplicity of Copernicus' system, in which certain ad hoc modifications used to account for the orbits of various planets were eliminated.

In 1956, Harvard denied Kuhn tenure because the tenure committee felt his book on the Copernican revolution was too popular in its approach and analysis. A friend of Kuhn's knew Stephen Pepper, chair of the philosophy department at the University of California at Berkeley, and told Pepper that Kuhn was looking for an academic position. Pepper's department was searching for someone to establish a program in the history and philosophy of science. Berkeley eventually offered Kuhn a position in the philosophy department and later asked if he also wanted an appointment in the history department. Kuhn accepted both positions and joined the Berkeley faculty as an assistant professor.

In the philosophy department Kuhn found Stanley Cavell, a soulmate to replace Nash. Kuhn had met Cavell earlier while they were both fellows at Harvard. Cavell was an ethicist and aesthetician whom Kuhn found intellectually stimulating, and he introduced Kuhn to Wittgenstein's notion of language games. Besides Cavell, Kuhn developed a professional relationship with Paul Feyerabend, who was also working on the notion of incommensurability.

In 1958, Berkeley promoted Kuhn to associate professor and granted him tenure. Moreover, having completed several historical projects, he was ready to return to the philosophical issues that first attracted him to the history of science. Beginning in the fall of 1958, he spent a year as a fellow at the Center for Advanced Study in the Behavioral Sciences at Stanford, California. What struck Kuhn about the relationships among behavioral and social scientists was their inability to agree on the fundamental problems and practices of their discipline. Although natural scientists do not necessarily have the right answers to their questions, there is agreement over fundamentals. This difference between natural and social scientists eventually led Kuhn to formulate the paradigm concept.

c. The Last Mile to Structure

Although The Copernican Revolution represented a significant advance in Kuhn's articulation of a revolutionary theory of science, several issues still needed attention. What was missing from Kuhn's reconstruction of the Copernican revolution was an understanding of how scientists function on a daily basis, when a revolution is not looming. That understanding emerged gradually during the last mile on the road to Structure, through three papers written from the mid-fifties to the early sixties.

In the first paper, ‘The function of measurement in modern physical science’, Kuhn challenged the belief that if scientists cannot measure a phenomenon then their knowledge of it is inadequate or not scientific. Part of the reason for Kuhn's concern over measurement in science was the textbook tradition, which he believed perpetuates a misleading myth about measurement. Kuhn compared the textbook presentation of measurement to a machine in which scientists feed laws and theories along with initial conditions into the machine's hopper at the top, turn a handle on the side representing logical and mathematical operations, and then collect numerical predictions exiting the machine's chute at the front. Scientists finally compare experimental observations to the theoretical predictions. On the textbook account, these measurements serve as a test of the theory, which is the confirmation function of measurement.

Kuhn claimed that the above function is not why measurements are reported in textbooks; rather, measurements are reported to give the reader an idea of what the professional community believes is reasonable agreement between theoretical predictions and experimental observations. Reasonable agreement, however, depends upon approximate, not exact, agreement between theory and data and differs from one science to the next. Moreover, external criteria do not exist for determining reasonableness. For Kuhn, the actual function of normal measurement in science is found in its journal articles. That function is neither the invention of novel theories nor the confirmation of older ones. Discovery and exploratory measurements in science are instead rare. The reason is that changes in theories, which require discovery or confirmation, occur during revolutions, which are also quite rare. Once a revolution occurs, moreover, the new theory only exhibits potential for ordering and explaining natural phenomena. The function of normal measurement is to tighten reasonable agreement between novel theoretical predictions and experimental observations.

The textbook tradition is also misleading in terms of normal measurement's effects. It claims that theories must conform to quantitative facts. Such facts are not the given but the expected, and the scientist's task is to obtain them. This obligation to obtain the expected quantitative fact is often the incentive for developing novel technology. Moreover, a well-developed theoretical system is required for meaningful measurement in science. Besides the function of normal or expected measurement, Kuhn also examined the function of extraordinary measurement, which pertains to unexpected results. It is this latter type of measurement that exhibits the discovery and confirmatory functions. When normal scientific practice consistently yields unexpected anomalies, a crisis arises, and extraordinary measurement often helps to resolve it. Crisis then leads to the invention of new theories, and again extraordinary measurement plays a critical role in this process. Theory invention in response to quantitative anomalies leads to decisive measurements for judging a novel theory's adequacy, whereas qualitative anomalies generally lead to ad hoc modifications of theories. Extraordinary measurement allows scientists to choose among competing theories.

Kuhn was moving closer towards a notion of normal science through this analysis of normal measurement, in contrast to extraordinary measurement, in science. His conception of science continued to distance him from traditional philosophers of science. But the notion of normal measurement was not as robust as he needed. Importantly, Kuhn was changing the agenda for philosophy of science from the justification of scientific theories as finished products in textbooks to the dynamic process by which theories are tested and assimilated into the professional literature. A robust notion of normal science was the revolutionary concept he needed to overturn the traditional image of science as an accumulated body of facts.

With the introduction of normal and extraordinary measurement, the step towards the notions of normal and extraordinary science in Kuhn's revolutionary image of science was imminent. Kuhn worked out those notions in The Essential Tension. He began by addressing the assumption that creative scientific thinking, and hence scientific advance, depends on unbridled imagination and divergent thinking, which involves identifying multiple avenues by which to solve a problem and determining which one works best. Kuhn acknowledged that such thinking is responsible for some scientific progress, but he proposed that convergent thinking, which limits itself to well-defined, often logical, steps for solving a problem, is also an important means of progress. While revolutions, which depend on divergent thinking, are an obvious means of scientific progress, Kuhn insisted that few scientists consciously design revolutionary experiments. Rather, most scientists engage in normal research, which represents convergent thinking. But, occasionally scientists may break with the tradition of normal science and replace it with a new tradition. Science, as a profession, is both traditional and iconoclastic, and the tension between the two often creates a space in which to practice it.

Next, Kuhn utilized the term paradigm while discussing the pedagogical advantages of convergent thinking, especially as displayed in science textbooks. Whereas textbooks in other disciplines include the methodological and conceptual conflicts prevalent within the discipline, science textbooks do not. Rather, science education is the transmission of a tradition that guides the activities of practitioners. In science education, students are taught not to evaluate the tradition but to accept it.

Progress within normal research projects represents attempts to bring theory and observation into closer agreement and to extend a theory’s scope to new phenomena. Given the convergent and tradition-bound nature of science education and of scientific practice, how can normal research be a means for the generation of revolutionary knowledge and technology? According to Kuhn, a mature science provides the background that allows practitioners to identify non-trivial problems or anomalies with a paradigm. In other words, without mature science there can be no revolution.

Kuhn continued to develop the notion of normal research and its convergent thinking in ‘The function of dogma in scientific research’. He began with the traditional image of science as an objective and critical enterprise. Although this is the ideal, the reality is that scientists often already know what to expect from their investigations of natural phenomena. If the expected is not forthcoming, then scientists must struggle to bring what they observe into conformity with what they expect; these expectations are what textbooks encode as dogmas. Dogmas are critical for the practice of normal science and for advancement in it because they define the puzzles for the profession and stipulate the criteria for their solution.

Kuhn next expanded the range of paradigms to embrace scientific practice in general, rather than simply serving as a model for research. Specifically, paradigms include not only a community's previous scientific achievements but also its theoretical concepts, its experimental techniques and protocols, and even the natural entities it recognizes. In short, they are the community's body of beliefs or foundations. Paradigms are also open-ended in terms of solving problems. Moreover, they are exclusive in nature, in that there is only one paradigm per mature science. Finally, they are not permanent fixtures of the scientific landscape, for eventually paradigms are replaceable. Importantly, for Kuhn, when one paradigm replaces another, the two are radically different.

Having done this paradigmatic spadework, Kuhn then discussed the notion of normal scientific research. The process of matching paradigm and nature includes extending and applying the paradigm to expected but also unexpected parts of nature. This means not so much discovering the unknown as explaining the known. Although the dogma paper is only a fragment of the solution to the problems associated with the traditional image of science, the complete solution was soon to appear in Structure.

3. The Structure of Scientific Revolutions

In July 1961, Kuhn completed a draft of Structure, and in 1962 it was published as the final monograph in the second volume of Neurath's International Encyclopedia of Unified Science. Charles Morris was instrumental in its publication and Carnap served as its editor. Structure was not a single publishing event confined to 1962; rather, its story covered the years from 1962 to 1970. After its publication, Kuhn was engrossed for the rest of the sixties in addressing criticisms directed at the ideas contained in it, especially the paradigm concept. During this time, he continued to develop and refine his new image of science. The endpoint was a second edition of Structure that appeared in 1970. The text of the revised edition, however, remained essentially unaltered; only a ‘Postscript—1969’ was added, in which Kuhn addressed his critics.

What Kuhn proposed in Structure was a new image of science. That image differed radically from the traditional one. The difference hinged on a shift from a logical analysis and an explanation of scientific knowledge as finished product to a historical narration and description of scientific practices by which a community of practitioners produces scientific knowledge. In short, it was a shift from the subject (the product) to the verb (to produce).

According to the traditional image, science is a repository of accumulated facts, discovered by individuals at specific periods in history. One of the central tasks of traditional historians, given this image of science, was to answer questions about who discovered what and when. Even though the task seemed straightforward, many historians found it difficult and doubted whether these were the right kinds of questions to ask concerning science’s historical record. The historiographic revolution in the study of science changed the sorts of questions historians asked by revising the underlying assumptions concerning the approach to reading the historical record. Rather than reading history backwards and imposing current ideas and values on the past, texts are read within their historical context thereby maintaining their integrity. The historiographic revolution also had implications for how to analyze and understand science philosophically. The goal of Structure, declared Kuhn, was to cash out those implications.

The structure of scientific development, according to Kuhn, may be illustrated schematically, as follows: pre-paradigm science → normal science → extraordinary science → new normal science. The step from pre-paradigm science to normal science involves consensus of the community around a single paradigm, where no prior consensus existed. This is the step required for transitioning from immature to mature science. The step from normal science to extraordinary science includes the community’s recognition that the reigning paradigm is unable to account for accumulating anomalies. A crisis ensues, and community practitioners engage in extraordinary science to resolve its anomalies. A scientific revolution occurs with crisis resolution. Once a community selects a new paradigm, it discards the old one and another period of new normal science follows. The revolution or paradigm shift is now complete, and the cycle from normal science to new normal science through revolution is free to occur again.

For Kuhn, the origin of a scientific discipline begins with the identification of a natural phenomenon, which members of the discipline investigate experimentally and attempt to explain theoretically. But, each member of that nascent discipline is at cross-purposes with other members; for each member often represents a school working from different foundations. Scientists, operating under these conditions, share few, if any, theoretical concepts, experimental techniques, or phenomenal entities. Rather, each school is in competition for monetary and social resources and for the allegiance of the professional guild. An outcome of this lack of consensus is that all facts seem equally relevant to the problem(s) at hand and fact gathering itself is often a random activity. There is then a proliferation of facts and hence little progress in solving the problem(s) under these conditions. Kuhn called this state pre-paradigm or immature science, which is non-directed and flexible, providing a community of practitioners little guidance.

To achieve the status of a science, a discipline must reach consensus with respect to a single paradigm. This is realized when, during the competition involved in pre-paradigm science, one school makes a stunning achievement that catches the professional community's attention. The candidate paradigm elicits the community's confidence that its problems are solvable with precision and in detail. The community's confidence in a paradigm to guide research is the basis for the conversion of its members, who now commit to it. After paradigm consensus, Kuhn claimed, scientists are in a position to commence the practice of normal science. The prerequisite of normal science, then, is a commitment to a shared paradigm that defines the rules and standards by which to practice science. Whereas pre-paradigm science is non-directed and flexible, normal or paradigm science is highly directed and rigid. Because of this directedness and rigidity, normal scientists are able to make the progress they do.

The paradigm concept loomed large in Kuhn’s new image of science. He defined the concept in terms of the community’s concrete achievements, such as Newtonian mechanics, which the professional can commonly recognize but cannot fully describe or explain. A paradigm is certainly not just a set of rules or algorithms by which scientists blindly practice their trade. In fact, there is no easy way to abstract a paradigm’s essence or to define its features exhaustively. Moreover, a paradigm defines a family resemblance, à la Wittgenstein, of problems and procedures for solving problems that are part of a single research tradition.

Although scientists rely, at times, on rules to guide research, these rules do not precede paradigms. Importantly, Kuhn was not claiming that rules are unnecessary for guiding research but rather that they are not always sufficient, either pedagogically or professionally. Kuhn compared the paradigm concept to Polanyi’s notion of tacit knowledge, in which knowledge production depends on the investigator’s acquisition of skills that do not reduce to methodological rules and protocols.

As noted above, Newtonian mechanics represents an example of a Kuhnian paradigm. The three laws of motion comprising it provided the scientific community with the resources to investigate natural phenomena in terms of both precision and predictability. In terms of precision, Newtonian mechanics allowed physicists to measure and explain accurately—with clockwork exactitude—the motion not only of celestial but also terrestrial bodies. With respect to prediction, physicists used the Newtonian paradigm to determine the potential movement of heavenly and earthly bodies. Thus, Newtonian mechanics qua paradigm equipped physicists with the ability to explain and manipulate natural phenomena. In sum, it became a way of viewing the world.

According to Kuhn, a paradigm allows scientists to ignore concerns over a discipline’s fundamentals and to concentrate on solving its puzzles—as the Newtonian paradigm permitted physicists to do for several centuries. It not only guides scientists in terms of identifying soluble puzzles, but it also prevents scientists from tackling insoluble ones. Kuhn compared paradigms to maps that guide and direct the community’s investigations. Only when a paradigm guides the community’s activities is scientific advancement as cumulative progress possible.

The activity of practitioners engaged in normal science is paradigm articulation and extension to new areas. Indeed, the Newtonian paradigm was adapted even for medicine. When a new paradigm is established, it solves only a few critical problems that faced the community. But, it does offer the promise for solving many more problems. Much of normal science involves mopping up, in which the community forces nature into a conceptually rigid framework—the paradigm. Rather than being dull and routine, however, such activity, according to Kuhn, is exciting and rewarding and requires practitioners who are creative and resourceful.

Normal scientists are not out to make new discoveries or to invent new theories outside the paradigm's aegis. Rather, they are involved in using the paradigm to understand nature precisely and in detail. On the experimental end of this task, normal scientists go to great pains to increase the precision and reliability of their measurements and facts. They are also involved in closing the gap between observations and theoretical predictions, and they attempt to clarify ambiguities left over from the paradigm's initial adoption. They also strive to extend the scope of the paradigm by including phenomena not heretofore investigated. Much of this activity requires exploratory investigation, in which normal scientists make discoveries that are novel but anticipated vis-à-vis the paradigm. Solving these experimental puzzles often requires considerable technological ingenuity and innovation on the part of the scientific community. As Kuhn notes, Atwood's machine, developed almost a century after Newton, is a good illustration of this.

Besides experimental puzzles, there are also the theoretical puzzles of normal science, which obviously mirror the types of experimental puzzles. Normal scientists conduct theoretical analyses to enhance the match between theoretical predictions and experimental observations, especially in terms of increasing the paradigm’s precision and scope. Again, just as experimental ingenuity is required so is theoretical ingenuity to explain natural phenomena successfully.

Normal science, according to Kuhn, is puzzle-solving activity, and its practitioners are puzzle solvers and not paradigm testers. The paradigm's power over a community of practitioners is that it can transform seemingly insoluble problems into soluble ones through the practitioner's ingenuity and skill. Besides the assured solution, Kuhn's paradigm concept also involved rules of the puzzle-solving game, not in a narrow sense of algorithms but in a broad sense of viewpoints or preconceptions. Besides these rules of the game, as it were, there are also metaphysical commitments, which inform the community as to the types of natural entities, and methodological commitments, which inform the community as to the kinds of laws and explanations. Although rules are often necessary for normal scientific research, they are not always required; normal science can proceed in the absence of such rules.

Although scientists engaged in normal science do not intentionally attempt to make unexpected discoveries, such discoveries do occur. Paradigms are imperfect, and rifts in the match between paradigm and nature are inevitable. For Kuhn, discoveries occur not only in terms of new facts; there is also invention in terms of novel theories. Both the discovery of new facts and the invention of novel theories begin with anomalies, which are violations of paradigm expectations during the practice of normal science. Anomalies can lead to unexpected discoveries. For Kuhn, unexpected discoveries involve complex processes that include the intertwining of both new facts and novel theories. Facts and theories go hand-in-hand, for such discoveries cannot be made by simple inspection. Because discoveries depend upon the intertwining of observations and theories, the discovery process takes time for the conceptual integration of the novel with the known. Moreover, that process is complicated by the fact that novelties are often resisted due to prior expectations. Because of allegiance to a paradigm, scientists are loath to abandon it simply because of an anomaly or even several anomalies. In other words, anomalies are generally not counter-instances that falsify a paradigm.

Just as anomalies are critical for the discovery of new facts or phenomena, so they are essential for the invention of novel theories. Although facts and theories are intertwined, the emergence of novel theories is the outcome of a crisis. The crisis is the result of the paradigm's breakdown, its inability to provide solutions to its anomalies. The community then begins to harbor questions about the ability of the paradigm to guide research, which has a profound impact upon it. The chief characteristic of a crisis is the proliferation of theories. As members of a community in crisis attempt to resolve its anomalies, they offer more and varied theories. Interestingly, the anomalies responsible for the crisis are not necessarily new; they may have been present all along. What turns them into a source of crisis is that the paradigm promised their resolution but was unable to fulfill its promise. The overall effect is a return to a situation very similar to pre-paradigm science.

Closure of a crisis occurs in one of three possible ways, according to Kuhn. First, on occasion the paradigm proves sufficiently robust to resolve the anomalies and to restore normal science practice. Second, sometimes even the most radical methods are unable to resolve the anomalies; under these circumstances, the community tables them for future investigation and analysis. Third, the crisis is resolved with the replacement of the old paradigm by a new one, but only after a period of extraordinary science.

Kuhn stressed that the initial response of a community in crisis is not to abandon its paradigm. Rather, its members make every effort to salvage it through ad hoc modifications until the anomalies can be resolved, either theoretically or experimentally. The reason for this strong allegiance, claimed Kuhn, is that a community must first have an alternative candidate to take the original paradigm's place. For science, at least normal science, is possible only with a paradigm, and to reject a paradigm without a substitute is to reject science itself, which reflects poorly on the community and not on the paradigm. Moreover, a community does not reject a paradigm simply because of a fissure in the paradigm-nature fit. Kuhn's aim was to reject a naïve Popperian falsificationism in which single counter-instances are sufficient to reject a theory. In fact, he turned the tables and contended that counter-instances are essential for the practice of vibrant normal science. Although the goal of normal science is not necessarily to generate counter-instances, normal science practice does provide the occasion for their possible occurrence. Normal science, then, serves as an opportunity for scientific revolutions. If there were no counter-instances, reasoned Kuhn, scientific development would come to a halt.

The transition from normal science through crisis to extraordinary science involves two key events. First, the paradigm’s boundaries become blurred when faced with recalcitrant anomalies; and, second, its rules are relaxed leading to proliferation of theories and ultimately to the emergence of a new paradigm. Often relaxing the rules allows practitioners to see exactly where the problem is and how to solve it. This state has tremendous impact upon a community’s practitioners, similar to that during pre-paradigm science. Extraordinary scientists, according to Kuhn, behave erratically—because scientists are trained under a paradigm to be puzzle-solvers, not paradigm-testers. In other words, they are not trained to do extraordinary science and must learn as they go. For Kuhn, this type of behavior is more open to psychological than logical analysis. Moreover, during periods of extraordinary science practitioners may even examine the discipline’s philosophical foundations. To that end, they analyze their assumptions in order to loosen the old paradigm’s grip on the community and to suggest alternative approaches to the generation of a new paradigm.

Although the process of extraordinary science is convoluted and complex, a replacement paradigm may emerge suddenly. Often the source of its inspiration is rooted in the practice of extraordinary science itself, in terms of the interconnections among various anomalies. Finally, whereas normal science is a cumulative process, adding one paradigm achievement to the next, extraordinary science is not; rather, it is like—using Herbert Butterfield’s analogy—grabbing hold of a stick’s other end. That other end of the stick is a scientific revolution.

The transition from extraordinary science to a new normal science represents a scientific revolution. According to Kuhn, a scientific revolution is a non-cumulative episode in which a newer paradigm replaces an older one, either partially or completely. Revolutions come in two sizes: major revolutions, such as the shift from the geocentric to the heliocentric universe, and minor revolutions, such as the discovery of X-rays or of oxygen. But whether big or small, all revolutions have the same structure: the generation of a crisis through irresolvable anomalies and the establishment of a new paradigm that resolves the crisis-producing anomalies.

Because of the extreme positions taken by participants in a revolution, opposing camps often become galvanized in their positions, and communication between them breaks down and discourse fails. The ultimate source for the establishment of a new paradigm during a crisis is community consensus, that is, when enough community members are convinced by persuasion and not simply by empirical evidence or logical analysis. Moreover, to accept the new paradigm, community practitioners must be assured that there is no chance for the old paradigm to solve its anomalies.

Persuasion loomed large in Kuhn’s scientific revolutions because the new paradigm solves the anomalies the old paradigm could not. Thus, the two paradigms are radically different from each other, often with little overlap between them. For Kuhn, a community can only accept the new paradigm if it considers the old one wrong. The radical difference between old and new paradigms, such that the old cannot be derived from the new, is the basis of the incommensurability thesis. In essence, there is no common measure or standard for the two paradigms. This is evident, claimed Kuhn, when looking at the meaning of theoretical terms. Although the terms from an older paradigm can be compared to those of a newer one, the older terms must be transformed with respect to the newer ones. But, there is a serious problem with restating the old paradigm in transformed terms. The older, transformed paradigm may have some utility, for example pedagogically, but a community cannot use it to guide its research. Like a fossil, it reminds the community of its history but it can no longer direct its future.

The establishment of a new paradigm resolves a scientific revolution and issues forth a new period of normal science. With its establishment, Kuhn's new image of a mature science comes full circle. Only after a period of intense competition among rival paradigms does the community choose a new paradigm, and scientists once again become puzzle-solvers rather than paradigm-testers. The resolution of a scientific revolution is not a straightforward process that depends only upon reason or evidence. Part of the problem is that proponents of competing paradigms cannot agree on the relevant evidence or proof, or even on the relevant anomalies that require resolution, since their paradigms are incommensurable.

Another factor that leads to difficulties in resolving scientific revolutions is that communication among members of a community in crisis is only partial. This results from the new paradigm borrowing theoretical terms, concepts, and laboratory protocols from the old paradigm. Although the two camps share the borrowed vocabulary and technology, the new paradigm gives them new meanings and uses. The net result is that members of competing paradigms talk past one another. Moreover, the change in paradigms is not a gradual process in which different parts of the paradigm are changed piecemeal; rather, the change must occur as a whole and all at once. Convincing scientists to make such a wholesale transformation takes time.

How then does one segment of the community convince or persuade another to switch paradigms? Members who worked for decades under the old paradigm may never accept the new one. Rather, it is often the younger members who accept the new paradigm, through something like a religious conversion. According to Kuhn, faith is the basis for conversion, especially faith in the potential of the new paradigm to solve future puzzles. By invoking the terms conversion and faith, Kuhn was not implying that arguments and reason are unimportant in a paradigm shift. Indeed, the most common reason for accepting a new paradigm is that it solves the anomalies the old paradigm could not. However, Kuhn's point was that argument and reason alone are insufficient. Aesthetic or subjective factors also play an important role in a paradigm shift, since the new paradigm solves only a few, but critical, anomalies. These factors weigh heavily in the shift initially, by reassuring community members that the new paradigm represents the discipline's future.

From the resolution of revolutions, Kuhn drew several important philosophical points concerning the principles of verification and falsification. As Kuhn acknowledged, philosophers no longer search for absolute verification, since no theory can be tested exhaustively; rather, they calculate the probability of a theory's verification. According to probabilistic verification, every imaginable theory must be compared with the others vis-à-vis the available data. The problem, in terms of Kuhn's new image of science, is that a theory is tested with respect to a given paradigm, and such a restriction precludes access to every imaginable theory. Moreover, Kuhn rejected falsification by single instances because no paradigm resolves every problem facing a community; under such a standard, no paradigm would ever be accepted. For Kuhn, any account of verification and falsification must accommodate the imprecision of the theory-fact fit.

An interesting feature of scientific revolutions, according to Kuhn, is their invisibility. What he meant is that in the process of writing textbooks, popular scientific essays, and even philosophy of science, the path to the current paradigm is sanitized to make it appear as if the paradigm was in some sense born mature. Disguising a paradigm's history is an outcome of a belief about scientific knowledge that considers it invariable and its accumulation linear. This disguising serves the winner of the crisis by establishing its authority, especially as a pedagogical aid for indoctrinating students into a community of practitioners. Another important effect of a revolution, related to a paradigm shift, is a shift in the community's image of science. The change in science's image should be no surprise, since the prevailing paradigm defines science; change that paradigm and science itself changes, at least in how it is practiced. In other words, the shift in science's image results from a change in the community's standards for what constitutes its puzzles and their solutions. Finally, revolutions transform scientists from practitioners of normal science, who are puzzle-solvers, into practitioners of extraordinary science, who are paradigm-testers. Besides transforming science, revolutions also transform the world that scientists inhabit and investigate.

One of the major impacts of a scientific revolution is a change of the world in which scientists practice their trade. Kuhn's world-changes thesis, as it has become known, is certainly one of his most radical and controversial ideas, alongside the associated incommensurability thesis. The issue is how far the change goes ontologically, or whether it is simply an epistemological ploy to reinforce the comprehensive effects of scientific revolutions. In other words, does the world really change, or only the worldview, that is, one's perspective on or perception of the world? For Kuhn, the answer relied not on a logical or even a philosophical analysis of the change but rather on a psychological one.

Kuhn analyzed the change in worldview by analogizing it to a gestalt switch, for example, the duck-rabbit figure. Although the gestalt analogy is suggestive, it is limited to perceptual changes and says little about the role of previous experience in such transformations. Previous experience is important because it influences what a scientist sees when making an observation. Moreover, with a gestalt switch, the person can stand above or outside of it, acknowledging with certainty that one now sees a duck and now a rabbit. Such an independent perspective, which is ultimately an authoritative standpoint, is not available to the community of practitioners; there is no answer sheet, as it were. Because the community's access to the world is limited by what it can observe, any change in what is observed has important consequences for the nature of what is observed; that is, the change has ontological significance.

Thus, for Kuhn, the change revolution brings about is more than simply seeing or observing a different world; it also involves working in a different world. The perceptual transformation is more than reinterpretation of data. For, data are not stable but they too change during a paradigm shift. Data interpretation is a function of normal science, while data transformation is a function of extraordinary science. That transformation is often a result of intuitions. Moreover, besides a change in data, revolutions change the relationships among data. Although traditional western philosophy has searched for three centuries for stable theory-neutral data or observations to justify theories, that search has been in vain. Sensory experience occurs through a paradigm of some sort, argued Kuhn, even articulations of that experience. Hence, no one can step outside a paradigm to make an observation; it is simply impossible given the limits of human physiology.

Kuhn then took on the nature of scientific progress. For normal science, progress is cumulative in that the solutions to puzzles form a repository of information and knowledge about the world. This progress is the result of the direction a paradigm provides a community of practitioners. Importantly, the progress achieved through normal science, in terms of the information and knowledge, is used to educate the next generation of scientists and to manipulate the world for human welfare. Scientific revolutions change all that. For Kuhn, revolutionary progress is not cumulative but non-cumulative.

What, then, does a community of practitioners gain by going through a revolution or paradigm shift? Has it made any kind of progress in its rejection of a previous paradigm and the fruit that paradigm yielded? Of course, the victors of the revolution are going to claim that progress was made after the revolution. To do otherwise would be to admit that they were wrong. Rather, advocates of the new normal science are going to do everything they can to ensure that their winning paradigm is seen as pushing forward a better understanding of the world. The progress achieved through a revolution is two-fold, according to Kuhn. The first gain is the successful solution of anomalies that the previous paradigm could not solve. The second is the promise of solving additional problems or puzzles that arise from these anomalies.

But has the community gotten closer to the truth, in the sense of verisimilitude, by going through a revolution? According to Kuhn, the answer is no. For Kuhn, progress in science is not directed activity towards some goal like truth. Rather, scientific progress is evolutionary. Just as natural selection operates during biological evolution in the emergence of a new species, so community selection during a scientific revolution functions similarly in the emergence of a new theory. And, just as species are adapted to their environments, so theories are adapted to the world. Kuhn had no answer to the question of why this should be so, other than that the world and the community investigating it exhibit unique features. What these features are, Kuhn did not know, but he concluded that the new image of science he had proposed would resolve these problems, like a new paradigm after a scientific revolution. He invited the next generation of philosophers of science to join him in a new philosophy of science incommensurate with its predecessor.

The reaction to Kuhn’s Structure was at first congenial, especially among historians of science, but within a few years it turned critical, particularly among philosophers. Critics charged him with irrationalism and epistemic relativism. Although he felt the reviews of Structure were good, his chief concerns were the tags of irrationalism and relativism—at least a pernicious kind of relativism. Kuhn believed the charges were inaccurate: he maintained simply that science does not progress toward a predetermined goal but that, as in evolutionary change, one theory replaces another that fits nature better than its competitors. Moreover, he believed that Darwinian evolution was the correct framework for discussing science's progress, but he felt no one took the idea seriously.

On 13 July 1965, Kuhn participated in an International Colloquium in the Philosophy of Science, held at Bedford College in London. The colloquium was organized jointly by the British Society for the Philosophy of Science and by the London School of Economics and Political Science. Kuhn delivered the initial paper comparing his and Karl Popper’s conceptions of the growth of scientific knowledge. John Watkins then delivered a paper criticizing Kuhn’s notion of normal science, with Popper chairing the session. Popper also presented a paper criticizing Kuhn, as did several other members of the philosophy of science community, including Stephen Toulmin, L. Pearce Williams, and Margaret Masterman, who identified twenty-one senses of Kuhn’s use of paradigm in Structure. Masterman concluded her paper inviting others to join in clarifying Kuhn’s paradigm concept.

Kuhn himself took up Masterman’s challenge and clarified the paradigm concept in the second edition of Structure, particularly in its ‘Postscript—1969’. To that end, he divided paradigm into disciplinary matrix and exemplars. The former represents the milieu of the professional practice, consisting of symbolic generalizations, models, and values, while the latter represents solutions to concrete problems that a community accepts as paradigmatic. In other words, exemplars serve as templates for solving problems or puzzles facing the scientific community and thereby for advancing the community’s scientific knowledge. For Kuhn, scientific knowledge is not localized simply within theories and rules; rather, it is localized within exemplars. The basis for an exemplar to function in puzzle solving is the scientist’s ability to see the similarity between a previously solved puzzle and a currently unsolved one.

In the early sixties, van Vleck invited Kuhn to direct a project collecting materials on the history of quantum mechanics. In August 1960, Hunter Dupree, Charles Kittel, Kuhn, John Wheeler, and Harry Wolff met in Berkeley to discuss the project's organization. Wheeler next met with Richard Shryock, and a joint committee of the American Physical Society and the American Philosophical Society on the History of Theoretical Physics in the Twentieth Century was formed to sponsor and develop the project. The project lasted for three years, with the first and last years conducted in Berkeley and the middle year in Europe. The National Science Foundation funded the project.

The project led to a publication, by John Heilbron and Kuhn, on the origins of the Bohr atom. They provided a revisionist narrative of Bohr's path to the quantized atom, beginning with his 1911 doctoral dissertation and concluding with his 1913 three-part paper on atomic structure. The intrigue of this historical study was that within a six-week period in mid-1912 Bohr went from having little interest in models of the atom to producing a quantized model of Ernest Rutherford's atom and applying that model to several perplexing problems. The authors explored Bohr's sudden interest in atomic models. They proposed that his interest stemmed from specific problems, which guided Bohr in terms of both his reading and research toward the potential of atomic structure for solving them. The solutions to those problems resulted from what Heilbron and Kuhn called a February 1913 transformation in Bohr's research. What initiated the transformation, claimed the authors, was that Bohr had read, a few months earlier, J.W. Nicholson's papers on the application of Max Planck's constant to generate an atomic model. Although Nicholson's model was incorrect, it led Bohr in the right direction. Then in February 1913, Bohr, in a conversation with H.M. Hansen, obtained the last piece of the puzzle. After the transformation, Bohr completed the atomic model project within the year.

Besides completing a draft of Structure in 1961, Kuhn was made full professor at Berkeley, but only in the history department. Members of the philosophy department voted to deny him promotion in their department, a denial that angered and hurt Kuhn tremendously. While he was in Europe, Princeton University made Kuhn an offer to join its faculty. The university had recently inaugurated a history and philosophy of science program. The program's chair was Charles Gillispie and its staff included John Murdoch, Hilary Putnam, and Carl Hempel. Upon returning to the United States in 1963, Kuhn visited Princeton. He decided to accept the offer and joined its faculty in 1964. He became the program's director in 1967, and the following year Princeton appointed him the Moses Taylor Pyne Professor of History. As the sixties ended, Structure was becoming increasingly popular, especially among student radicals who believed it liberated them from the tyranny of tradition.

4. The Road after Structure

In 1979, Kuhn moved to M.I.T.’s Department of Linguistics and Philosophy. In 1983, he was appointed the Laurance S. Rockefeller Professor of Philosophy. At M.I.T., he took a linguistic turn in his thinking, reflecting his new environment, which had a major impact on his subsequent work, especially on the incommensurability thesis.

Structure’s success not only established the historiographic revolution in the study of science, both historically and philosophically, in what came to be called the discipline of history and philosophy of science, but also supported the rise of science studies in general, and specifically the sociology and anthropology of science, particularly the sociology of scientific knowledge. Kuhn rejected both of these trajectories often attributed to Structure in favor of what he called historical philosophy of science. As he categorized his work in The Essential Tension, he conducted either historical studies of science and their historiographic implications, or metahistorical studies and their philosophical implications. In other words, his scholarly work was either historical or philosophical.

a. Historical and Historiographic Studies

Kuhn’s final major historical study was on Planck's black-body radiation theory and the origins of quantum discontinuity. The transition from classical physics, in which particles pass through intermediate energy stages, to quantum physics, in which energy change is discontinuous, is traditionally attributed to Planck's 1900 and 1901 quantum papers. According to Kuhn, this traditional account is inaccurate; the transition was instead initiated by Albert Einstein's and Paul Ehrenfest's independent 1906 quantum papers. Kuhn's realization of this inaccuracy was similar to the enlightenment he experienced when struggling to make sense of Aristotle's notion of mechanical motion. His initial epiphany occurred while reading Planck's 1895 paper on black-body radiation. Through that experience, he realized that Planck's 1900 and 1901 quantum papers were not the initiation of a new theory of quantum discontinuity; rather, they represented Planck's effort to derive the black-body distribution law from classical statistical mechanics. Kuhn concluded the study with an analysis of Planck's second black-body theory, first published in 1911, in which Planck used the notion of discontinuity in the derivation. Contrary to the traditional position, which claimed the second theory represents a regression on Planck's part to classical physics, Kuhn argued that it represents the first time Planck incorporated into his theoretical work a theory in which he was not completely confident.

In the black-body radiation and quantum discontinuity study, Kuhn did not use the notions of paradigm, normal science, anomaly, crisis, or incommensurability, which he had championed in Structure. Critics, especially within the history and philosophy of science discipline, were disappointed, and Kuhn bemoaned the book's reception, even by its supporters. However, he later explored the historiographic and philosophical issues raised in Black-Body Theory with respect to Structure. The historiographic issues the former book addressed were the same as those raised in the 1962 monograph. Specifically, he claimed that current historiography should attempt to understand previous scientific texts in terms of their contemporary context and not in terms of modern science. Kuhn's concern was more than historical accuracy; rather, he was interested in recapturing the thought processes that lead to a change in theory. Although Structure was Kuhn's articulation of this process of scientific change, its terminology did not represent a straitjacket for narrating history. For Kuhn, the terminology and vocabulary used in Structure, like paradigm and normal science, were not products, such as metaphysical categories, to which a historical narrative must conform; rather, they had a different metaphysical function, as presuppositions for an historical narrative understood as process. In other words, Structure's terminology and vocabulary were tools by which to reconstruct a scientific historical narrative and not a template for articulating it.

The purpose of the history of science, according to Kuhn, was not just getting the facts straight but providing philosophers of science with an accurate image of science with which to practice their trade. Kuhn fervently believed that the new historiography of science would prevent philosophers from engaging in the excesses and distortions prevalent within traditional philosophy of science. He envisioned history of science informing philosophy of science as historical philosophy of science, rather than history and philosophy of science, since the relationship was asymmetrical.

Prior to 1950, history of science was a discipline practiced mostly by eminent scientists, who generally wrote heroic biographies or sweeping overviews of a discipline, often for pedagogical purposes. Within the past generation, historians of science such as Alexandre Koyré, Anneliese Maier, and E.J. Dijksterhuis developed an approach to the history of science that was more than simply chronicling science's theoretical and technical achievements. An important factor in that development was the recognition of institutional and sociological factors in the practice of science. A consequence of this historiographic revolution was the distinction between internal and external histories of science. Internal history of science is concerned with the development of the theories and methods employed by scientists. In other words, it studies the history of events, people, and ideas internal to scientific advancement. The historian as internalist attempts to climb inside the minds of scientists as they push forward the boundaries of their discipline. External history of science concentrates on the social and cultural factors that impinge on the practice of science.

For Kuhn, the distinction between internal and external histories of science mapped onto his pattern of scientific development. External or cultural and social factors are important during a scientific discipline’s initial establishment; however, once established, those factors no longer have a major impact on a community’s practice or its generation of scientific knowledge. They can have a minor impact on a mature science’s practice, such as the timing of technological innovation. Importantly for Kuhn, internal and external approaches to the history of science are not necessarily mutually exclusive but complementary.

b. Metahistorical Studies

As mentioned already, Kuhn considered himself a practitioner of both the history of science and the philosophy of science and not the history and philosophy of science, for a very practical reason. Crassly put, the goal for history is the particular while for philosophy the universal. Kuhn compared the differences between the two disciplines to a duck-rabbit Gestalt switch. In other words, the two disciplines are so fundamentally different in terms of their goals, that the resulting images of science are incommensurable. Moreover, to see the other discipline’s image requires a conversion. For Kuhn, then, the history of science and the philosophy of science cannot be practiced at the same time but only alternatively, and then with difficulty.

How then can the history of science be of use to philosophers of science? The answer for Kuhn was by providing an accurate image of science. Rejecting the covering law model for historical explanation because it reduces historians to mere social scientists, Kuhn advocated an image based on ordering of historical facts into a narrative analogous to the one he proposed for puzzle solving under the aegis of a paradigm in the physical sciences. Historians of science, as they narrate change in science, provide an image of science that reflects the process by which scientific information develops, rather than the image provided by traditional philosophers of science in which scientific knowledge is simply a product of logical verification or falsification. Kuhn insisted that the history of science and the philosophy of science remain distinct disciplines, so that historians of science can provide an image of science to correct the distortion produced by traditional philosophers of science.

According to Kuhn, the social history of science also distorts the image of science. For social historians, scientists construct rather than discover scientific knowledge. Although Kuhn was sympathetic to this type of history, he believed it created a gap between older constructions and the ones replacing them, a gap he challenged historians of science to fill. Besides social historians of science, Kuhn also accused sociologists of science of distorting the image of science. Although Kuhn acknowledged that factors such as interests, power, and authority, among others, are important in the production of scientific knowledge, their predominant use by sociologists eclipses other factors, such as nature itself. The key to rectifying the distortion introduced by sociologists is to shift from a rationality of belief, that is, the reasons scientists hold specific beliefs, to a rationality of change in belief, that is, the reasons scientists change their beliefs. For Kuhn, a historical philosophy of science was the means for correcting these distortions of the scientific image.

Kuhn’s historical philosophy of science focused on the metahistorical issues derived from historical research, particularly scientific development and the related issues of theory choice and incommensurability. Importantly for Kuhn, both theory choice and incommensurability are intimately linked to one another. The former cannot be reduced to an algorithm of objective rules but requires subjective values because of the latter.

Kuhn explored scientific development using three different approaches. The first was in terms of problem versus puzzle solving. According to Kuhn, problems have no ready solution; problem solving is generally pragmatic and is the hallmark of an underdeveloped or immature science. Puzzles, on the other hand, occupy the attention of scientists involved in a developed or mature science. Although puzzles have guaranteed solutions, the methods for solving them are not assured. Scientists who solve them demonstrate their ingenuity and are rewarded by the community.

With this distinction in mind, Kuhn envisioned scientific development as the transition of a scientific discipline from an underdeveloped problem-solving state to a developed puzzle-solving one. The question then arises as to how this occurs. The answer that many took from Structure was: adopt a paradigm. However, Kuhn found this answer to be incorrect in that paradigms are not unique to the sciences. But does articulating the question in terms of puzzle-solving help? Kuhn's answer was pragmatic, that is, keep trying different solutions until one works. In other words, philosophers of science had no exemplars by which to solve their problems.

Kuhn’s second approach to scientific development was in terms of the growth of knowledge. He proposed an alternative to the traditional view that scientific knowledge grows by a piecemeal accumulation of facts. To shed light on this alternative, Kuhn offered a different reconstruction of science. The central ideas of a science cohere with one another, forming the core of that science. Besides the core, there is a periphery, an area where scientists can investigate problems associated with a research tradition without changing core ideas.

Kuhn then drew parallels between the current reconstruction of science and the earlier one in Structure. Obviously, the transition in cores from one research tradition to another is a scientific revolution. Moreover, the core represents a paradigm that defines a particular research tradition. Finally, the periphery is identified with normal science. The core then provides the means by which to practice science, and to change the core requires significant retooling that practitioners naturally resist.

Is this change in the core a growth of knowledge? To answer the question, Kuhn examined the standard account of knowledge as justified true belief. What he found problematic with the account was the amount or nature of the evidence needed to justify a belief. And this, of course, raised the issue of truth, for which he had no ready solution. Ultimately, Kuhn equivocated on the question of the growth of knowledge.

Kuhn’s final approach to scientific development was through the analysis of three scientific revolutions: the shift from Aristotelian to Newtonian physics, Volta’s discovery of the electric cell, and Planck’s black-body radiation research and quantum discontinuity. From these examples, Kuhn derived three characteristics of scientific revolutions. The first was holistic in that scientific revolutions are all-or-none events. The second was the way referents change after revolutions, especially in terms of taxonomic categories. According to Kuhn, revolutions redistribute objects among these categories. The final characteristic of scientific revolutions was a change in a discipline’s analogy, metaphor, or model, which represents the connection between taxonomic categories and the world’s structure.

According to traditional philosophers of science, the objective features of a good scientific theory include accuracy, consistency, scope, simplicity, and fecundity. However, these features, when used individually as criteria for theory choice, are imprecise and often conflict with one another, Kuhn argued. Although necessary for theory choice, they are insufficient and must be supplemented by the characteristics of the scientists making the choices. These characteristics involve personal experience or biography and personality or psychological traits. In other words, theory choice relies not only on a theory’s objective features but also on individual scientists’ subjective characteristics.

Why have traditional philosophers of science ignored or neglected subjective factors in theory choice? Part of the answer is that they confined the subjective to the context of discovery, while restricting the objective to the context of justification. Kuhn insisted that this distinction does not fit observations of scientific practice. It is artificial, reflecting science pedagogy: textbook presentations of theory choice are stylized to convince students, who rely on the authority of their instructors. What else can students do? Textbook science discloses only the product of science, not its process. For Kuhn, since subjective factors are present at the discovery phase of science, they should also be present at the justification phase.

According to Kuhn, objective criteria function as values, which do not dictate theory choice but rather influence it. Values help to explain scientists’ behavior, which to the traditional philosopher of science may at times appear irrational. Most importantly, values account for disagreement over theories and help to distribute risk during debates over them. Kuhn’s position had important consequences for the philosophy of science. He maintained that critics misinterpreted his position on theory choice as subjective. For them, the term denoted a matter of taste that is not rationally discussable; his use of the term, by contrast, did allow for rational discussion with respect to standards. Moreover, Kuhn denied that facts are theory independent and that there is strictly a rational choice to be made. Rather, he contended that scientists do not choose a theory on the basis of objective criteria alone but are converted on the basis of subjective values.

Finally, Kuhn discussed theory choice with respect to the incommensurability thesis. The question he entertained was what type of communication is possible among community members holding competing theories. The answer, according to Kuhn, is that communication is partial. This answer raised a second, and more important, question for Kuhn and his critics: is good reason, vis-à-vis the empirical evidence, available to justify theory choice, given such partial communication? The answer would be straightforward if communication were complete, but it is not. For Kuhn, this situation meant that ultimately a reasonable evaluation of the empirical evidence is not compelling for theory choice; it also, of course, raised the charge of irrationality, which he denied.

Kuhn identified two common misconceptions of his version of the incommensurability thesis. The first was that since two incommensurable theories cannot be stated in a common language, they cannot be compared with one another in order to choose between them. The second was that since an older theory cannot be translated into modern expression, it cannot be articulated meaningfully.

Kuhn addressed the first misconception by distinguishing between incommensurability as no common measure and as no common language. He defined the incommensurability thesis in terms of the latter rather than the former. Most theoretical terms are homophonic and can have the same meaning in two competing theories; only a handful of terms are incommensurable or untranslatable. Kuhn considered this a modest version of the incommensurability thesis, calling it local incommensurability, and claimed that it was his original intention. Although there may be no common language in which to compare terms that change their meaning during a scientific revolution, there is a partially common language composed of the invariant terms that does permit some semblance of comparison. Thus, Kuhn argued, the first criticism fails because, and this was his main point, an incommensurable residue remains even with a partially common language.

As for the second misconception, Kuhn claimed that critics conflate translation and interpretation. The conflation is understandable, since translation often involves interpretation. Translation for Kuhn is the process by which words or phrases of one language are substituted for those of another. Interpretation, however, is the attempt to make sense of a statement, to make it intelligible. Incommensurability, then, does not mean that a theoretical term cannot be interpreted, that is, made intelligible; rather, it means that the term cannot be translated, that is, there is no equivalent for it in the competing theoretical language. In other words, for the theoretical term to have meaning the scientist must go native in its use.

Kuhn introduced the notion of the lexicon and its attendant taxonomy to capture both a term’s reference and its intension, or sense. In the lexicon, referring terms are interrelated with other referring terms, which Kuhn called the holistic principle. The lexicon’s structure of interrelated terms resembles the world’s structure in terms of its taxonomic categories. A particular scientific community uses its lexicon to describe and explain the world in terms of this taxonomy. And, members of a community, or of different communities, must share the same lexicon if they are to communicate fully with one another. Moreover, claimed Kuhn, if full translation is to be achieved, the two languages must share a similar structure with respect to their respective lexicons. Incommensurability, then, reflects lexicons that have different taxonomic structures by which the world is carved up and articulated.

Kuhn also addressed a problem that involves communication among communities who hold incommensurable theories, or who occupy positions across a historical divide. Kuhn noted that although lexicons can change dramatically, this does not deter members from reconstructing their past in the current lexicon’s vocabulary. Such reconstruction obviously plays an important function in the community. But the issue is that, given the incommensurable nature of theories, assessments of true and false or right and wrong are unwarranted, and for this critics charged Kuhn with relativism, a position he was less inclined to deny.

The charge stemmed from the fact that Kuhn advocated no privileged position from which to evaluate a theory. Rather, evaluations must be made within the context of a particular lexicon, and thus evaluations are relative to the relevant lexicon. But Kuhn found the charge of relativism trivial. He acknowledged that his position on the relativity of truth and objectivity, with respect to the community’s lexicon, left him no option but to take literally the world changes associated with lexical changes. But is this an idealist position? Kuhn admitted that it appears to be, but he claimed that it is an idealism like none other: on the one hand, the world is composed of the community’s lexicon, but on the other hand, preconceived ideas cannot mold it.

c. Evolutionary Philosophy of Science

From the mid-1980s to the early 1990s, Kuhn transitioned from a historical philosophy of science and the paradigm concept to an evolutionary philosophy of science and the notion of the lexicon. To that end, he identified an alternative role for the incommensurability thesis with respect to segregating or isolating lexicons and their associated worlds from one another. Incommensurability now functioned for Kuhn as a mechanism to isolate a community’s lexicon from another’s and as a means to underpin a notion of scientific progress as the proliferation of scientific specialties. In other words, as the taxonomic structures of two lexicons become isolated and thereby incommensurable with one another, according to Kuhn, a new specialty and its lexicon split off from the old or parent specialty and its lexicon. This process accounts for a notion of scientific progress as an increase in the number of scientific specialties after a revolution.

Scientific progress, then, is akin to biological speciation, argued Kuhn, with incommensurability serving as the isolation mechanism. The result is a tree-like structure with increased specialization at the tips of the branches. Finally, Kuhn’s evolutionary philosophy of science is non-teleological in the sense that science progresses not towards an ultimate truth about the world but simply away from a lexicon that cannot be used to solve its anomalies to one that can. However, he still articulated incommensurability in terms of no common language, with its attendant problems involving the notion of meaning, and did not transform it fully with respect to an evolutionary philosophy of science.

Kuhn was working out an evolutionary philosophy of science in a proposed book, Words and Worlds: An Evolutionary View of Scientific Development. He divided it into three parts, with three chapters in each. In the first part, Kuhn framed the problem associated with the incommensurability thesis and addressed the difficulties of accessing past scientific achievements. In the first chapter, he presented an evolutionary view of scientific development. Without an Archimedean platform to guide theory assessment, Kuhn proposed a comparative method for assessing theoretical changes; the method forbids the assessment of theories in isolation, which he regarded as a methodological solecism. In the next chapter, he discussed the problems associated with examining past historical studies in science. Based on several historical cases, he claimed that anomalies in older scientific texts could be understood only through an interpretative process involving an ethnographic or a hermeneutical reading. He had now laid the groundwork for examining the incommensurability thesis. In the third chapter, Kuhn discussed changes of word-meanings as changes in a taxonomy embedded in a lexicon—an apparatus of a language’s referring terms. The result of these changes was an untranslatable gap between two incommensurable theories. Finally, the lexical terms referring to objects change as scientific specialties proliferate.

In the book’s second part, Kuhn continued to explore the nature of a community’s lexicon, which he explicated in terms of taxonomic categories. These categories are grouped into contrast sets, and no overlap of categories exists within the same contrast set, which Kuhn called the no-overlap principle. The principle prohibits two terms from referring to the same objects unless the terms are related to one another as species to genus. Moreover, the properties of the categories are reflected in the properties of their names. A term’s meaning is then a function of its taxonomic category, and this restriction is the origin of untranslatability. In the first chapter of this part, Kuhn discussed the nature of substances in terms of sortal predicates. This move allowed Kuhn to introduce plasticity into the lexicon’s usage. Moreover, the differentiating set is not strictly conventional but relies on the world to which the different sets connect. In the next chapter, Kuhn extended the lexicon notion to artifacts, abstractions, and theoretical entities.

In the final chapter of the second part, Kuhn specified the means by which community members acquire a lexicon. First, they must already possess a vocabulary for physical entities and forces. Second, definitions play little, if any, role in learning new terms; rather, those terms are acquired through ostensive examples, especially through problem solving and laboratory demonstrations. Third, a single example is inadequate for learning the meaning of a term; multiple examples are required. Fourth, acquisition of a new term within a statement also requires acquisition of other new terms within that statement. Finally, students can acquire the terms of a lexicon through different pedagogical routes.

In the book’s concluding part, Kuhn discussed what occurs during a change in the lexicon and the implications for scientific development. In chapter seven, he examined the means by which lexicons change and the repercussions such change has for communication among communities with different lexicons. Moreover, he explored the role of arguments in lexical change. In the subsequent chapter, Kuhn identified the type of progress achieved with changes in lexicons. He maintained that progress is not the type that aims at a specific goal but rather is instrumental. In the final chapter, he broached the issues of relativism and realism not in traditional terms of truth and objectivity but rather with respect to the capability of making a statement. Statements from incommensurable theories that cannot be translated are ultimately ineffable. They can be neither true nor false but their capability of being stated is relative to the community’s history.

In sum, the book’s aim was certainly to address the philosophical issues left over from Structure, but more importantly, it was to resolve the problems generated by a historical philosophy of science. Although others were also responsible for creating those problems, Kuhn assumed responsibility for resolving them, and the sine qua non for doing so was the incommensurability thesis. For Kuhn, the thesis was needed more than ever to defend rationality from the postmodern development of the strong program.

5. Conclusion

In May 1990, a conference—or, as Hempel called it, a Kuhnfest—was held in Kuhn’s honor at MIT, sponsored by the Sloan Foundation and organized by Paul Horwich and Judith Thomson. The conference speakers included Jed Buchwald, Nancy Cartwright, John Earman, Michael Friedman, Ian Hacking, John Heilbron, Ernan McMullin, N. M. Swerdlow, and Norton Wise. The papers reflected Kuhn’s impact on the history and the philosophy of science. Hempel made a special appearance on the last day, followed by Kuhn’s remarks on the conference papers. As he approached the podium after Hempel’s remarks, before a standing-room-only audience, Kuhn was visibly moved by the outpouring of professional appreciation for his contributions to a discipline that he cherished, from members whom he truly respected.

Kuhn retired from teaching in 1991 and became an emeritus professor at MIT. During his career, he received numerous awards and accolades. He held honorary degrees from around a dozen academic institutions, including the University of Chicago, Columbia University, the University of Padua, and the University of Notre Dame. He was elected a member of the National Academy of Sciences—the most prestigious society for U.S. scientists—and was an honorary life member of the New York Academy of Sciences and a corresponding fellow of the British Academy. He was president of the History of Science Society from 1968 to 1970, and the society awarded him its highest honor, the Sarton Medal, in 1982. Kuhn also received the Howard T. Behrman Award for distinguished achievement in the humanities in 1977 and the celebrated John Desmond Bernal Award in 1983. Kuhn died on 17 June 1996 in Cambridge, Massachusetts, after suffering for two years from cancer of the throat and bronchial tubes.

6. References and Further Reading

a. Kuhn’s Work

  • Kuhn Papers, MIT MC 240, Institute Archives and Special Collections, MIT Libraries, Cambridge, MA.
  • Kuhn, T. S. (1957) The Copernican Revolution: Planetary Astronomy in the Development of Western Thought. Cambridge, MA: Harvard University Press.
  • Kuhn, T. S. (1962) The Structure of Scientific Revolutions. Chicago, IL: University of Chicago Press.
  • Kuhn, T. S. (1963) ‘The function of dogma in scientific research’, in A.C. Crombie, ed. Scientific Change: Historical Studies in the Intellectual, Social and Technical Conditions for Scientific Discovery and Technical Invention, From Antiquity to the Present. New York: Basic Books, pp. 347-69.
  • Kuhn, T. S., Heilbron, J. L., Forman, P. and Allen, L. (1967) Sources for History of Quantum Physics: An Inventory and Report. Philadelphia, PA: American Philosophical Society.
  • Heilbron, J. L., and Kuhn, T. S. (1969) ‘The genesis of the Bohr atom’. Historical Studies in the Physical Sciences, 1, 211-90.
  • Kuhn, T. S. (1970) The Structure of Scientific Revolutions (2nd edition). Chicago, IL: University of Chicago Press.
  • Kuhn, T. S. (1977) The Essential Tension: Selected Studies in Scientific Tradition and Change. Chicago: University of Chicago Press.
  • Kuhn, T. S. (1987) Black-Body Theory and the Quantum Discontinuity, 1894-1912 (revised edition). Chicago: University of Chicago Press.
  • Kuhn, T. S. (1990) ‘Dubbing and redubbing: the vulnerability of rigid designation’, in C.W. Savage, ed. Scientific Theories. Minneapolis, MN: University of Minnesota Press, pp. 298-318.
  • Kuhn, T. S. (2000) The Road since Structure: Philosophical Essays, 1970-1993, with an Autobiographical Interview. Chicago: University of Chicago Press.
    • Contains a comprehensive interview with Kuhn covering his life and work.

b. Secondary Sources

  • Andersen, H. (2001) On Kuhn. Belmont, CA: Wadsworth Publishing.
    • A general introduction to Kuhn and his philosophy.
  • Andersen, H., Barker, P. and Chen, X. (2006) The Cognitive Structure of Scientific Revolutions. New York: Cambridge University Press.
  • Barnes, B. (1982) T.S. Kuhn and Social Science. London: Macmillan Press.
    • Discusses the impact of Kuhn’s philosophy for the sociology of science.
  • Bernardoni, J. (2009) Knowing Nature without Mirrors: Thomas Kuhn’s Antirepresentationalist Objectivity. Saarbrücken, DE: VDM Verlag Dr. Müller.
  • Bird, A. (2000) Thomas Kuhn. Princeton, NJ: Princeton University Press.
    • A critical introduction to Kuhn’s philosophy of science.
  • Bird, A. (2012) ‘The Structure of Scientific Revolutions and its significance: an essay review of the fiftieth anniversary edition’. British Journal for the Philosophy of Science, 63, 859-83.
  • Buchwald, J. Z. and Smith, G. E. (1997) ‘Thomas S. Kuhn, 1922-1996’. Philosophy of Science, 64, 361-76.
  • D’Agostino, F. (2010) Naturalizing Epistemology: Thomas Kuhn and the Essential Tension. New York: Palgrave Macmillan.
  • Davidson, K. (2006) The Death of Truth: Thomas S. Kuhn and the Evolution of Ideas. New York: Oxford University Press.
  • Favretti, R. R., Sandri, G. and Scazzieri, R., eds. (1999) Incommensurability and Translation: Kuhnian Perspectives on Scientific Communication and Theory Change. Northampton, MA: Edward Elgar.
  • Fuller, S. (2000) Thomas Kuhn: A Philosophical History of Our Times. Chicago, IL: University of Chicago Press.
    • A revisionist account of Kuhn as a foot soldier in Conant’s agenda to educate the public about science.
  • Fuller, S. (2004) Kuhn vs. Popper: The Struggle for the Soul of Science. New York: Columbia University Press.
  • Gattei, S. (2008) Thomas Kuhn’s ‘Linguistic Turn’ and the Legacy of Logical Empiricism: Incommensurability, Rationality and the Search for Truth. Burlington, VT: Ashgate.
  • Gutting, G., ed. (1980) Paradigms and Revolutions: Appraisals and Applications of Thomas Kuhn’s Philosophy of Science. Notre Dame, IN: University of Notre Dame Press.
    • A collection of articles addressing Kuhn’s philosophy of science.
  • Heilbron, J. L. (1998) ‘Thomas Samuel Kuhn’. Isis, 89, 505-15.
  • Horgan, J. (1991) ‘Reluctant revolutionary’. Scientific American, 264, 40-9.
    • Based on an interview with Kuhn about his philosophy.
  • Hufbauer, K. (2012) ‘From student of physics to historian of science: TS Kuhn’s education and early career, 1940–1958’. Physics in Perspective, 14, 421-70.
    • A detailed reconstruction of Kuhn’s education and early career at Harvard.
  • Horwich, P., ed. (1993) World Changes: Thomas Kuhn and the Nature of Science. Cambridge, MA: MIT Press.
    • The published papers from the 1990 Kuhnfest.
  • Hoyningen-Huene, P. (1993) Reconstructing Scientific Revolutions: Thomas S. Kuhn’s Philosophy of Science. Chicago, IL: University of Chicago Press.
  • Hoyningen-Huene, P. and Sankey, H., eds. (2001) Incommensurability and Related Matters. Boston, MA: Kluwer.
  • Hung, E. H. -C. (2006) Beyond Kuhn: Scientific Explanation, Theory Structure, Incommensurability, and Physical Necessity. Burlington, VT: Ashgate.
  • Kindi, V. and Arabatzis T., eds. (2012) Kuhn’s The Structure of Scientific Revolutions Revisited. New York: Routledge.
    • A collection of essays examining the impact of Structure on contemporary philosophy of science.
  • Kuukkanen, J. M. (2008) Meaning Changes: A Study of Thomas Kuhn’s Philosophy. Saarbrücken, DE: VDM Verlag Dr. Müller.
  • Lakatos, I. and Musgrave, A., eds. (1970) Criticism and the Growth of Knowledge. Cambridge, U.K.: Cambridge University Press.
    • The published papers from the 1965 London colloquium.
  • Marcum, J.A. (2015) Thomas Kuhn’s Revolutions: A Historical and an Evolutionary Philosophy of Science? London: Bloomsbury.
  • Nickles, T., ed. (2003) Thomas Kuhn. Cambridge, UK: Cambridge University Press.
  • Onkware, K. (2010) Thomas Kuhn and Scientific Progression: Investigation on Kuhn’s Account of How Science Progresses. Saarbrücken, DE: Lambert Academic Publishing.
  • Preston, J. M. (2008) Kuhn’s The Structure of Scientific Revolutions: A Reader’s Guide. London: Continuum.
  • Ruse, M. (1999) The Darwinian Revolution: Science Red in Tooth and Claw (2nd edition). Chicago, IL: University of Chicago Press.
  • Sankey, H. (1994) The Incommensurability Thesis. London: Ashgate.
  • Sardar, Z. (2000) Thomas Kuhn and the Science Wars. New York: Totem Books.
  • Sharrock, W., and Read, R. (2002) Kuhn: Philosopher of Scientific Revolution. Cambridge, UK: Polity.
  • Sigurdsson, S. (1990) ‘The nature of scientific knowledge: an interview with Thomas Kuhn’. Harvard Science Review, Winter issue, 18-25.
  • Suppe, F., ed. (1977) The Structure of Scientific Theories (2nd edition). Urbana, IL: University of Illinois Press.
  • Swerdlow, N.M. (2013) ‘Thomas S. Kuhn, 1922-1996’. Biographical Memoir, National Academy of Sciences USA. http://www.nasonline.org/publications/biographical-memoirs/memoir-pdfs/kuhn-thomas.pdf.
  • Torres, J. M., ed. (2010) On Kuhn’s Philosophy and Its Legacy. Lisbon: Faculdade de Ciências da Universidade de Lisboa.
  • von Dietze, E. (2001) Paradigms Explained: Rethinking Thomas Kuhn’s Philosophy of Science. Westport, CT: Praeger.
  • Wade, N. (1977) ‘Thomas S. Kuhn: revolutionary theorist of science’. Science, 197, 143-5.
  • Wang, X. (2007) Incommensurability and Cross-Language Communication. Burlington, VT: Ashgate.
  • Wray, K. B. (2011) Kuhn’s Evolutionary Social Epistemology. New York: Cambridge University Press.

 

Author Information

James A. Marcum
Email: James_Marcum@baylor.edu
Baylor University
U. S. A.

Morality and Cognitive Science

What do we know about how people make moral judgments? And what should moral philosophers do with this knowledge? This article addresses the cognitive science of moral judgment. It reviews important empirical findings and discusses how philosophers have reacted to them.

Several trends have dominated the cognitive science of morality in the early 21st century. One is a move away from strict opposition between biological and cultural explanations of morality’s origin, toward a hybrid account in which culture greatly modifies an underlying common biological core. Another is the fading of strictly rationalist accounts in favor of those that recognize an important role for unconscious or heuristic judgments. Along with this has come expanded interest in the psychology of reasoning errors within the moral domain. Another trend is the recognition that moral judgment interacts in complex ways with judgment in other domains; rather than being caused by judgments about intention or free will, moral judgment may partly influence them. Finally, new technology and neuroscientific techniques have led to novel discoveries about the functional organization of the moral brain and the roles that neurotransmitters play in moral judgment.

Philosophers have responded to these developments in a variety of ways. Some deny that the cognitive science of moral judgment has any relevance to philosophical reflection on how we ought to live our lives, or on what is morally right to do. One argument to this end follows the traditional is/ought distinction and insists that we cannot generate a moral ought from any psychological is. Another argument insists that the study of morality is autonomous from scientific inquiry, because moral deliberation is essentially first-personal and not subject to any third-personal empirical correction.

Other philosophers argue that the cognitive science of moral judgment may have significant revisionary consequences for our best moral theories. Some make an epistemic argument: if moral judgment aims at discovering moral truth, then psychological findings can expose when our judgments are unreliable, like faulty scientific instruments. Other philosophers focus on non-epistemic factors, such as the need for internal consistency within moral judgment, the importance of conscious reflection, or the centrality of intersubjective justification. Certain cognitive scientific findings might require a new approach to these features of morality.

The first half of this article (sections 1 to 4) surveys the cognitive science literature, describing key experimental findings and psychological theories in the moral domain. The second half (sections 5 to 10) discusses how philosophers have reacted to these findings, examining the different ways philosophers have sought to immunize moral inquiry from empirical revision or have enthusiastically taken up psychological tools to make new moral arguments.

Note that the focus of this article is on moral judgment. See the article “Moral Character” for discussion of the relationship between cognitive science and moral character.

Table of Contents

  1. Biological and Cultural Origins of Moral Judgment
  2. The Psychology of Moral Reasoning
  3. Interaction between Moral Judgment and Other Cognitive Domains
  4. The Neuroanatomy of Moral Judgment
  5. What Do Moral Philosophers Think of Cognitive Science?
  6. Moral Cognition and the Is/Ought Distinction
    1. Semantic Is/Ought
    2. Non-semantic Is/Ought
  7. The Autonomy of Morality
  8. Moral Cognition and Moral Epistemology
  9. Non-epistemic Approaches
    1. Consistency in Moral Reasoning
    2. Rational Agency
    3. Intersubjective Justification
  10. Objections and Alternatives
    1. Explanation and Justification
    2. The Expertise Defense
    3. The Regress Challenge
    4. Positive Alternatives
  11. References and Further Reading

1. Biological and Cultural Origins of Moral Judgment

One key empirical question is this: are moral judgments rooted in innate factors or are they fully explained by acquired cultural traits? During the 20th century, scientists tended to adopt extreme positions on this question. The psychologist B. F. Skinner (1971) saw moral rules as socially conditioned patterns of behavior; given the right reinforcement, people could be led to judge virtually anything morally right or wrong. The biologist E. O. Wilson (1975), in contrast, argued that nearly all of human morality could be understood via the application of evolutionary biology. Around the early 21st century, however, most researchers on moral judgment began to favor a hybrid model, allowing roles for both biological and cultural factors.

There is evidence that at least the precursors of moral judgment are present in humans at birth, suggesting an evolutionary component. In a widely cited study, Kiley Hamlin and colleagues examined the social preferences of pre-verbal infants (Hamlin, Wynn, and Bloom 2007). The babies, seated on their parents’ laps, watched a puppet show in which a character trying to climb a hill was helped up by one puppet, but pushed back down by another puppet. Offered the opportunity to reach out and grasp one of the two puppets, babies showed a preference for the helping puppet over the hindering puppet. This sort of preference is not yet full-fledged moral judgment, but it is perhaps the best we can do in assessing the social responses of humans before the onset of language, and it suggests that however human morality comes about, it builds upon innate preferences for pro-social behavior.

A further piece of evidence comes from the work of the theorists Leda Cosmides and John Tooby (1992), who argue that the minds of human adults display an evolutionary specialization for moral judgment. The Wason Selection Task is an extremely well-established paradigm in the psychology of reasoning, which shows that most people make persistent and predictable mistakes in evaluating abstract inferences. A series of studies by Cosmides, Tooby, and their colleagues shows that people do much better on a form of this task when it is restricted to violations of social norms. So, for instance, rather than being asked to evaluate an abstract rule linking numbers and colors, people were asked to evaluate a rule prohibiting those below a certain age from consuming alcohol. Participants in these studies made the usual mistakes when looking for violations of abstract rules, but made fewer mistakes in detecting violations of social rules. According to Cosmides and Tooby, these results suggest that moral judgment evolved as a domain-specific capacity, rather than as an application of domain-general reasoning. If this is right, then there must be at least an innate core to moral judgments.
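To see the kind of mistake at issue, here is a schematic rendering of the abstract task in logical notation. The particular rule and cards are illustrative stand-ins rather than the exact materials used in the studies cited above.

```latex
% A schematic abstract Wason task (illustrative; not the original stimuli).
% Rule: "If a card has an even number on one side (P), then it is red on
% the other side (Q)." Four cards show: 8, 3, red, brown.
\[
P \rightarrow Q \quad \text{is falsified only by a case of} \quad P \wedge \neg Q .
\]
% The logically correct cards to turn over are therefore 8 (a P case) and
% brown (a not-Q case). Participants typically select 8 and red, even
% though the red card cannot falsify the rule; performance improves when
% the same structure is framed as detecting violators of a social rule
% (for example, underage drinking).
```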

Perhaps the most influential hybrid model of innate and cultural factors in moral judgment research is the linguistic analogy research program (Dwyer 2006; Hauser 2006; Mikhail 2011). This approach is explicitly modeled on Noam Chomsky’s (1965) generative linguistics. According to Chomsky, the capacity for language production and some basic structural parameters for functioning grammar are innate, but the enormous diversity of human languages comes about through myriad cultural settings and prunings within the evolutionarily allowed range of possible grammars. By analogy, then, moral grammar suggests that the capacity for making moral judgments is innate, along with some basic constraints on the substance of the moral domain—but within this evolutionarily enabled space, culture works to select and prune distinct local moralities.

The psychologist Jonathan Haidt (2012) has highlighted the role of cultural difference in the scientific study of morality through his influential Moral Foundations account. According to Haidt, all documented moral beliefs can be classified as fitting within a handful of moral sub-domains, such as harm-aversion, justice (in distribution of resources), respect for authority, and purity. Haidt argues that moral differences between cultures reflect differences in emphasis on these foundations. In fact, he suggests, industrialized western cultures appear to have emphasized the harm-aversion and justice foundations almost exclusively, unlike many other world cultures. Yet Haidt also insists upon a role for biology in explaining moral judgment; he sees each of the foundations as rooted in a particular evolutionary origin. Haidt’s foundations account remains quite controversial, but it is a prominent example of contemporary scientists’ focus on avoiding polarized answers to the biology versus culture question.

The rest of this article mostly sets aside further discussion of the distal—evolutionary or cultural—explanation of moral judgment. Instead it focuses on proximal cognitive explanations for particular moral judgments. This is because the ultimate aim is to consider the philosophical significance of moral cognitive science, whereas the moral philosophical uptake of debates over evolution is discussed elsewhere. Interested readers can see the articles on “Evolutionary Ethics” and “Moral Relativism.”

2. The Psychology of Moral Reasoning

One crucial question is whether moral judgments arise from conscious reasoning and reflection, or are triggered by unconscious and immediate impulses. In the 1970s and 1980s, research in moral psychology was dominated by the work of Lawrence Kohlberg (1971), who advocated a strongly rationalist conception of moral judgment. According to Kohlberg, mature moral judgment demonstrates a concern with reasoning through highly abstract social rules. Kohlberg asked his participants (boys and young men) to express opinions about ambiguous moral dilemmas and then explain the reasons behind their conclusions. Kohlberg took it for granted that his participants were engaged in some form of reasoning; what he wanted to find out was the nature and quality of this reasoning, which he claimed emerged through a series of developmental stages. This approach came under increasing criticism in the 1980s, particularly through the work of feminist scholar Carol Gilligan (1982), who exposed the trouble caused by Kohlberg’s reliance on exclusively male research participants. See the article “Moral Development” for further discussion of that issue.

Since the turn of the twenty-first century, psychologists have placed much less emphasis on the role of reasoning in moral judgment. A turning point was the publication of Jonathan Haidt’s paper, “The Emotional Dog and Its Rational Tail” (Haidt 2001). There Haidt discusses a phenomenon he calls “moral dumbfounding.” He presented his participants with provocative stories, such as a brother and sister who engage in deliberate incest, or a man who eats his dog after it has been accidentally killed. Asked to explain their moral condemnation of these characters, participants cited reasons that seem to be ruled out by the description of the stories—for instance, participants said that the incestuous siblings might create a child with dangerous birth defects, though the original story made clear that they took extraordinary precautions to avoid conception. When reminded of these details, participants did not withdraw their moral judgments; instead they said things like, “I don’t know why it’s wrong, it just is.” According to Haidt, studies of moral dumbfounding confirm a pattern of evidence that people do not really reason their way to moral conclusions. Moral judgment, Haidt argues, arrives in spontaneous and unreflective flashes, and reasoning is only something done post hoc, to rationalize already-made judgments.

Many scientists and philosophers have written about the evidence Haidt discusses. A few take extremist positions, absolutely upholding the old rationalist tradition or firmly insisting that reasoning plays no role at all. (Haidt himself has slightly softened his anti-rationalism in later publications.) But it is probably right to say that the dominant view in moral psychology is a hybrid model. Many of our everyday moral judgments do arise in sudden flashes, without any need for explicit reasoning. But when faced with new and difficult moral situations, and sometimes even when confronted with explicit arguments against our existing beliefs, we are able to reason our way toward new moral judgments.

The dispute between rationalists and their antagonists is primarily one about procedure: are moral judgments produced through conscious thought or unconscious psychological mechanisms? There is a related but distinct set of issues concerning the quality of moral judgment. However moral judgment works, explicitly or implicitly, is it rational? Is the procedure that produces our moral judgments a reliable procedure? (It is important to note that a mental procedure might be unconscious and still be rational. Many of our simple mathematical calculations are accomplished unconsciously, but that alone does not keep them from counting as rational.)

There is considerable evidence suggesting that our moral judgments are unreliable (and so arguably irrational). Some of this evidence comes from experiments imported from other domains of psychology, especially from behavioral economics. Perhaps most famous is the Asian disease case first employed by the psychologists Amos Tversky and Daniel Kahneman in the early 1980s. Here is the text they gave to some of their participants:

Imagine that the U.S. is preparing for the outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat this disease have been proposed. Assume that the exact scientific estimate of the consequences of the programs are as follows:

If Program A is adopted, 200 people will be saved.

If Program B is adopted, there is a 1/3 probability that 600 will be saved, and 2/3 probability that no people will be saved. (Tversky and Kahneman 1981, 453)

Other participants read a modified form of this scenario, with the first option being that 400 people will die and the second option being that there is a 1/3 probability that nobody will die and a 2/3 probability that 600 people will die. Notice that this is a change only in wording: the first option is the same either way, whether you describe it as 200 people being saved or 400 dying, and the second option gives the same probabilities of outcomes under either description. Notice also that, in terms of expected outcomes, the two programs are mathematically equivalent: the certain survival of one third of the people equals, in expectation, a one-in-three chance that all of them survive, as the calculation below makes explicit. Yet participants responded strongly to the way the choices were described; those who read the original phrasing preferred Program A three to one, while those who read the other phrasing showed almost precisely the opposite preference. Apparently, describing the choice in terms of saving makes people prefer the certain outcome (Program A), while describing it in terms of dying makes people prefer the chancy outcome (Program B).
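The expected values of the two programs can be computed directly; this brief calculation is added here for illustration and is not part of Tversky and Kahneman’s quoted text.

```latex
\[
\mathbb{E}[\text{lives saved} \mid A] = 200,
\qquad
\mathbb{E}[\text{lives saved} \mid B] = \tfrac{1}{3}\cdot 600 + \tfrac{2}{3}\cdot 0 = 200 .
\]
% In the "dying" frame the same options read:
\[
\mathbb{E}[\text{deaths} \mid A] = 400,
\qquad
\mathbb{E}[\text{deaths} \mid B] = \tfrac{1}{3}\cdot 0 + \tfrac{2}{3}\cdot 600 = 400 .
\]
```

Since the two descriptions pick out the same options with the same expected outcomes, any systematic difference in preference must be driven by the framing itself.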

This study is one of the most famous in the literature on framing effects, which shows that people’s judgments are affected by merely verbal differences in how a set of options is presented. Framing effects have been shown in many types of decision, especially in economics, but when the outcomes concern the deaths of large numbers of people it is clear that they are of moral significance. Many studies have shown that people’s moral judgments can be manipulated by mere changes in verbal description, without (apparently) any difference in features that matter to morality (see Sinnott-Armstrong 2008 for other examples).

Partly on this basis, some theorists have advocated a heuristics and biases approach to moral psychology. A cognitive bias is a systematic defect in how we think about a particular domain, where some feature influences our thinking in a way that it should not. Some framing effects apparently trigger cognitive biases; the saving/dying frame exposes a bias away from risk-taking in saving frames and a bias toward it in dying frames. (Note that this leaves open whether either is the correct response—the point is that the mere difference of frame should not affect our responses, so at least one of the divergent responses must be mistaken.)

In the psychology of (non-moral) reasoning, a heuristic is a sort of mental short-cut, a way of skipping lengthy or computationally demanding explicit reasoning. For instance, if you want to know which of several similar objects is the heaviest, you could assume that it is the largest. Heuristics are valuable because they save time and are usually correct—but in certain types of cases an otherwise valuable heuristic will make predictable errors (some small objects are actually denser and so heavier than large objects). Perhaps some of our simple moral rules (“do no harm”) are heuristics of this sort—right most of the time, but unreliable in certain cases (perhaps it is sometimes necessary to harm people in emergencies). Some theorists (for example, Sunstein 2005) argue that large sectors of our ordinary moral judgments can be shown to exhibit heuristics and biases. If this is right, most moral judgment is systematically unreliable.
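As a toy illustration of how such a short-cut works and where it predictably fails, here is a small sketch of the size-for-weight heuristic described above. The items, names, and numbers are invented for the example and are not drawn from the cited literature.

```python
# A toy illustration of a heuristic: guess "heaviest" by size rather than
# weighing each object. Fast and usually right, but it fails predictably
# when a small object is much denser than a large one.
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    volume_cm3: float   # what the heuristic sees
    mass_g: float       # the quantity we actually care about

def heaviest_by_size(items):
    """Heuristic: assume the largest item is the heaviest."""
    return max(items, key=lambda i: i.volume_cm3)

def heaviest_by_mass(items):
    """Ground truth: weigh everything (slower in real life)."""
    return max(items, key=lambda i: i.mass_g)

items = [
    Item("foam block", volume_cm3=8000, mass_g=400),   # large but light
    Item("steel cube", volume_cm3=1000, mass_g=7850),  # small but dense
]

print(heaviest_by_size(items).name)  # -> "foam block" (heuristic error)
print(heaviest_by_mass(items).name)  # -> "steel cube"
```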

A closely related type of research shows that moral judgment is affected not only by irrelevant features within the questions (such as phrasing) but also by completely accidental features of the environment in which we make judgments. To take a very simple example: if you sit down to record your judgments about morally dubious activities, the severity of your reaction will depend partly on the cleanliness of the table (Schnall et al. 2008). You are likely to give a harsher assessment of the bad behavior if the table around you is sticky and covered in pizza boxes than if it is nice and clean. Similarly, you are likely to render more negative moral judgments if you have been drinking bitter liquids (Eskine, Kacinik, and Prinz 2011) or handling dirty money (Yang et al. 2013). Watching a funny movie will make you temporarily less condemning of violence (Strohminger, Lewis, and Meyer 2011).

Some factors affecting moral judgment are internal to the judge rather than the environment. If you have been given a post-hypnotic suggestion to feel queasy whenever you hear a certain completely innocuous word, you will probably make more negative moral judgments about characters in stories containing that word (Wheatley and Haidt 2005). Such effects are not restricted to laboratories; one Israeli study showed that parole boards are more likely to be lenient immediately after meals than when hungry (Danziger, Levav, and Avnaim-Pesso 2011).

Taken together, these studies appear to show that moral judgment is affected by factors that are morally irrelevant. The degree of wrongness of an act does not depend on which words are used to describe it, or the cleanliness of the desk at which it is considered, or whether the thinker has eaten recently. There seems to be strong evidence, then, that moral judgments are at least somewhat unreliable.  Moral judgment is not always of high quality. The extent of this problem, and what difference it might make to philosophical morality, is a matter that is discussed later in the article.

3. Interaction between Moral Judgment and Other Cognitive Domains

Setting aside for now the reliability of moral judgment, there are other questions we can ask about its production, especially about its causal structure. Psychologically speaking, what factors go into producing a particular moral judgment about a particular case? Are these the same factors that moral philosophers invoke when formally analyzing ethical decisions? Empirical research appears to suggest otherwise.

One example of this phenomenon concerns intention. Most philosophers have assumed that in order for something done by a person to be evaluable as morally right or wrong, the person must have acted intentionally (at least in ordinary cases, setting aside negligence). If you trip me on purpose, that is wrong, but if you trip me by accident, that is merely unfortunate. Following this intuitive point, we might think that assessment of intentionality is causally prior to assessment of morality. That is, when I go to evaluate a potentially morally important situation, I first work out whether the person involved acted intentionally, and then use this judgment as an input to working out whether what they have done is morally wrong. Hence, in this simple model, the causal structure of moral judgment places it downstream from intentionality judgment.

But empirical evidence suggests that this simple model is mistaken. A very well-known set of studies concerning the side-effect effect (also known as the Knobe effect, for its discoverer Joshua Knobe) appears to show that the causal relationship between moral judgment and intention-assessment is much more complicated. Participants were asked to read short stories like the following:

The vice-president of a company went to the chairman of the board and said, “We are thinking of starting a new program. It will help us increase profits, but it will also harm the environment.” The chairman of the board answered, “I don’t care at all about harming the environment. I just want to make as much profit as I can. Let’s start the new program.” They started the new program. Sure enough, the environment was harmed. (Knobe 2003, 191)

Other participants read the same story, except that the program would instead help, rather than harm, the environment as a side effect. Both groups of participants were asked whether the executive intentionally brought about the side effect. Strikingly, when the side effect was a morally bad one (harming the environment), 82% of participants thought it was brought about intentionally, but when the side effect was morally good (helping the environment) 77% of participants thought it was not brought about intentionally.

There is a large literature offering many different accounts of the side-effect effect (which has been repeatedly experimentally replicated). One plausible account is this: people sometimes make moral judgments prior to assessing intentionality. Rather than intentionality-assessment always being an input to moral judgment, sometimes moral judgment feeds input to assessing intentionality. A side effect judged wrong is more likely to be judged intentional than one judged morally right. If this is right, then the simple model of the causal structure of moral judgment cannot be quite correct—the causal relation between intentionality-assessment and moral judgment is not unidirectional.

Other studies have shown similar complexity in how moral judgment relates to other philosophical concepts. Philosophers have often thought that questions about causal determinism and free will are conceptually prior to attributing moral responsibility. That is, whether or not we can hold people morally responsible depends in some way on what we say about freedom of the will. Some philosophers hold that determinism is compatible with moral responsibility and others deny this, but both groups start from thinking about the metaphysics of agency and move toward morality. Yet experimental research suggests that the causal structure of moral judgment works somewhat differently (Nichols and Knobe 2007). People asked to judge whether an agent can be morally responsible in a deterministic universe seem to base their decision in part on how strongly they morally evaluate the agent’s actions. Scenarios involving especially vivid and egregious moral violations tend to produce greater endorsement of compatibilism than more abstract versions. The interpretation of these studies is highly controversial, but at minimum they seem to cast doubt on a simple causal model placing judgments about free will prior to moral judgment.

A similar pattern holds for judgments about the true self. Some moral philosophers hold that morally responsible action is action resulting from desires or commitments that are a part of one’s true or deep self, rather than momentary impulses or external influences. If this view reflects how moral judgments are made, then we should expect people to first assess whether a given action results from an agent’s true self and then render moral judgment about those that do. But it turns out that moral judgment provides input to true self assessments. Participants in one experiment (Newman, Bloom, and Knobe 2014) were asked to consider a preacher who explicitly denounced homosexuality while secretly engaging in gay relationships. Asked to determine which of these behaviors reflected the preacher’s true self, participants first employed their own moral judgment; those disposed to see homosexuality as morally wrong thought the preacher’s words demonstrated his true self, while those disposed to accept homosexuality thought the preacher’s relationships came from his true self. The implication is that moral judgment is sometimes a causal antecedent of other types of judgments, including those that philosophers have thought conceptually prior to assessing morality.

4. The Neuroanatomy of Moral Judgment

Physiological approaches to the study of moral judgment have taken on an increasingly important role. Employing neuroscientific and psychopharmacological research techniques, these studies help to illuminate the functional organization of moral judgment by revealing the brain areas implicated in its exercise.

An especially central concern in this literature is the role of emotion in moral judgment. An early influential study by Jorge Moll and colleagues (2005) demonstrated selective activity for moral judgment in a network of brain areas generally understood to be central to emotional processing. This work employed functional magnetic resonance imaging (fMRI), the workhorse of modern neuroscience, in which a powerful magnet is used to produce a visual representation of relative levels of blood flow and oxygenation, a proxy for neural activity, in various brain areas. Employing fMRI allows researchers to get a near-real-time representation of the brain’s activities while the conscious subject makes judgments about morally important scenarios.

One extremely influential fMRI study of moral judgment was conducted by the philosopher-neuroscientist Joshua D. Greene and colleagues. They compared brain activity in people making deontological moral judgments with brain activity while making utilitarian moral judgments. (To oversimplify: a utilitarian moral judgment is one primarily attentive to the consequences of a decision, even allowing the deliberate sacrifice of an innocent to save a larger number of others. Deontological moral judgment is harder to define, but for our purposes means moral judgment that responds to factors other than consequences, such as the individual rights of someone who might be sacrificed to save a greater number. See “Ethics.”)

In a series of empirical studies and philosophical papers, Greene has argued that his results show that utilitarian moral judgment correlates with activity in cognitive or rational brain areas, while deontological moral judgment correlates with activity in emotional areas (Greene 2008). (He has since softened this view a bit, conceding that both types of moral judgment involve some form of emotional processing. He now holds that deontological emotions are a type that trigger automatic behavioral responses, whereas utilitarian emotions are flexible prompts to deliberation (Greene 2014).) According to Greene, learning these psychological facts gives us reason to distrust our deontological judgments; in effect, his is a neuroscience-fueled argument on behalf of utilitarianism. This argument is at the center of a still very spirited debate. Whatever its outcome, Greene’s research program has had an undeniable influence on moral psychology; his scenarios (which derive from philosophical thought experiments by Philippa Foot and Judith Jarvis Thomson) have been adopted as standard across much of the discipline, and the growth of moral neuroimaging owes much to his project.

Alongside neuroimaging, lesion study is one of the central techniques of neuroscience. Recruiting research participants who have pre-existing localized brain damage (often due to physical trauma or stroke) allows scientists to infer the function of a brain area from the behavioral consequences of its damage. For example, participants with damage to the ventromedial prefrontal cortex, who have persistent difficulties with social and emotional processing, were tested on dilemmas similar to those used by Greene (Koenigs et al. 2007). These patients showed a greater tendency toward utilitarian judgments than did healthy controls. Similar lesion studies have since found a range of different results, so the empirical debate remains unsettled, but the technique continues to be important.

Two newer neuroscientific research techniques have begun to play important roles in moral psychology. Transcranial magnetic stimulation (TMS) uses magnetic pulses to suppress or heighten activity in a brain region. In effect, this allows researchers to (temporarily) alter healthy brains and correlate this alteration with behavioral effects. For instance, one study (Young et al. 2010) used TMS to suppress activity in the right temporoparietal junction, an area associated with assessing the mental states of others (see “Theory of Mind”). After the TMS treatment, participants’ moral judgments showed less sensitivity to whether characters in a dilemma acted intentionally or accidentally. Another technique, transcranial direct current stimulation (tDCS), has been shown to increase compliance with social norms when applied to the right lateral prefrontal cortex (Ruff, Ugazio, and Fehr 2013).

Finally, it is possible to study the brain not only at the gross structural scale, but also by examining its chemical operations. Psychopharmacology is the study of the cognitive and behavioral effects of chemical alteration of the brain. In particular, the levels of neurotransmitters, which regulate brain activity in a number of ways, can be manipulated by introducing pharmaceuticals. For example, participants’ readiness to make utilitarian moral judgments can be altered by administration of the drugs propranolol (Terbeck et al. 2013) and citalopram (Crockett et al. 2010).

5. What Do Moral Philosophers Think of Cognitive Science?

So far this article has described the existing science of moral judgment: what we have learned empirically about the causal processes by which moral judgments are produced. The rest of the article discusses the philosophical application of this science. Moral philosophers try to answer substantive ethical questions: what are the most valuable goals we could pursue? How should we resolve conflicts among these goals? Are there ways we should not act even if doing so would promote the best outcome? What is the shape of a good human life and how could we acquire it? How is a just society organized? And so on.

What cognitive science provides is causal information about how we typically respond to these kinds of questions. But philosophers disagree about what to make of this information. Should we ever change our answers to ethical questions on the basis of what we learn about their causal origins? Is it ever reasonable (or even rationally mandatory) to abandon a confident moral belief because of a newly learned fact about how one came to believe it?

We now consider several prominent responses to these questions. We start with views that deny much or any role for cognitive science in moral philosophy. We then look at views that assign to cognitive science a primarily negative role, in disqualifying or diminishing the plausibility of certain moral beliefs. We conclude by examining views that see the findings of cognitive science as playing a positive role in shaping moral theory.

6. Moral Cognition and the Is/Ought Distinction

We must start with the famous is/ought distinction. Often attributed to Hume (see “Hume”), the distinction is a commonsensical point about the difference between descriptive claims that characterize how things are (for example, “the puppy was sleeping”) and prescriptive claims that assert how things should or should not be (for instance, “it was very wrong of you to kick that sleeping puppy”). These are widely taken to be two different types of claims, and there is a lot to be said about the relationship between them. For our purposes, we may gloss it as follows: people often make mistakes when they act as if an ought-claim follows immediately and without further argument from an is-claim. The point is not (necessarily) that it is always a mistake to draw an ought-conclusion from an is-statement, just that the relationship between them is messy and it is easy to get confused. Some philosophers do assert the much stronger claim that ought-statements can never be validly drawn from is-statements, but this is not what everyone means when the issue is raised.

For our purposes, we are interested in how the is/ought distinction might matter to applying cognitive science to moral philosophy. Cognitive scientific findings are is-type claims; they describe facts about how our minds actually do work, not about how they ought to work. Yet the kind of cognitive scientific claims of interest here are claims about moral cognition—is-claims about the origin of ought-claims. Not surprisingly, then, if the is/ought distinction tends to mark moments of high risk for confusion, we should expect this to be one of those moments. Some philosophers have argued that attempts to change moral beliefs on the basis of cognitive scientific findings are indeed confusions of this sort.

a. Semantic Is/Ought

In the mid-twentieth century it was popular to understand the is/ought distinction as a point about moral semantics. That is, the distinction pointed to a difference in the implicit logic of two uses of language. Descriptive statements (is-statements) are, logically speaking, used to attribute properties to objects; “the puppy is sleeping” just means that the sleeping-property attaches to the puppy-object. But prescriptive statements (ought-statements) do not have this logical structure. Their surface descriptive grammar disguises an imperative or expressive logic. So “it was very wrong of you to kick that sleeping puppy” is not really attributing the property of wrongness to your action of puppy-kicking. Rather, the statement means something like “don’t kick sleeping puppies!” or even “kicking sleeping puppies? Boo!” Or perhaps “I do not like it when sleeping puppies are kicked and I want you to not like it as well.” (See “Ethical Expressivism.”)

If this analysis of the semantics of moral statements is right, then we can easily see why it is a mistake to draw ought-conclusions from is-premises. Logically speaking, simple imperatives and expressives do not follow from simple declaratives. If you agree with “the puppy was sleeping” yet refuse to accept the imperative “don’t kick sleeping puppies!” you haven’t made a logical mistake. You haven’t made a logical mistake even if you also agree with “kicking sleeping puppies causes them to suffer.” The point here isn’t about the moral substance of animal cruelty—the point is about the logical relationship between two types of language use. There is no purely logical compulsion to accept any simple imperative or expressive on the basis of any descriptive claim, because the types of language do not participate in the right sort of logical relationship.

Interestingly, this sort of argument has not played much of a role in the debate about moral cognitive science, though seemingly it could. After all, cognitive scientific claims are, logically speaking, descriptive claims, so we could challenge their logical relevance to assessing any imperative or expressive claims. But, historically speaking, the rise of moral cognitive science came mostly after the height of this sort of semantic argument. Understanding the is/ought distinction in semantic terms like these had begun to fade from philosophical prominence by the 1970s, and especially by the 1980s when modern moral cognitive science (arguably) began. Contemporary moral expressivists are often eager to explain how we can legitimately draw inferences from is to ought despite their underlying semantic theory.

It is possible to see the simultaneous decline of simple expressivism and the rise of moral cognitive science as more than mere coincidence. Some historians of philosophy see the discipline as having pivoted from the linguistic turn of the late 19th and early 20th centuries to the cognitive turn (or conceptual turn) of the late 20th century. Philosophy of language, while still very important, has receded from its position at the center of every philosophical inquiry. In its place, at least in some areas of philosophy, is a growing interest in naturalism and consilience with scientific inquiry. Rather than approaching philosophy via the words we use, theorists often now approach it through the concepts in our minds—concepts which are in principle amenable to scientific study. As philosophers became more interested in the empirical study of their topics, they were more likely to encourage and collaborate with psychologists. This has certainly contributed to the growth of moral cognitive science since the 1990s.

b. Non-semantic Is/Ought

Setting aside semantic issues, we still have a puzzle about the is/ought distinction. How do we get to an ought-conclusion from an is-premise? The idea that prescriptive and descriptive claims are different types of claim retains its intuitive plausibility. Some philosophers have argued that scientific findings cannot lead us to (rationally) change our moral beliefs because science issues only descriptive claims. It is something like a category mistake to allow our beliefs in a prescriptive domain to depend crucially upon claims within a descriptive domain. More precisely, it is a mistake to revise your prescriptive moral beliefs because of some purely descriptive fact, even if it is a fact about those beliefs.

Of course, the idea here cannot be that it is always wrong to update moral beliefs on the basis of new scientific information. Imagine that you are a demolition crew chief and you are just about to press the trigger to implode an old factory. Suddenly one of your crew members shouts, “Wait, look at the thermal monitor! There’s a heat signature inside the factory—that’s probably a person! You shouldn’t press the trigger!” It would be absurd for you to reply that whether or not you should press the trigger cannot depend on the findings of scientific contraptions like thermal monitors.

What this example shows is that the is/ought problem, if it is a problem, is obviously not about excluding all empirical information from moral deliberation. But it is important to note that the scientific instrument itself does not tell us what we ought to do. We cannot just read off moral conclusions from descriptive scientific facts. We need some sort of bridging premise, something that connects the purely descriptive claim to a prescriptive claim. In the demolition crew case, it is easy to see what this bridging premise might be, something like: “if pressing a trigger will cause the violent death of an innocent person, then you should not press the trigger.” In ordinary moral interactions we often leave bridging premises implicit—it is obvious to everyone in this scenario that the presence of an innocent human implies the wrongness of going ahead with implosion.

But there is a risk in leaving our bridging premises implicit. Sometimes people seem to be relying upon implicit bridging premises that are not mutually agreed on, or that may not make any sense at all. Consider: “You shouldn’t give that man any spare change. He is a Templar.” Here you might guess that the speaker thinks Templars are bad people and do not deserve help, but you might not be sure—why would your interlocutor even care about Templars? And you are unlikely to agree with this premise anyway, so unless the person makes their anti-Templar views explicit and justifies them, you do not have much reason to follow their moral advice. Another example: “The tiles in my kitchen are purple, so it’s fine for you to let those babies drown.” It is actually hard to interpret this utterance as something other than a joke or metaphor. If someone tried to press it seriously, we would certainly demand to be informed of the bridging premise between tile color and nautical infanticide, and we would be skeptical that anything plausible might be provided.

So far, so simple. Now consider: “Brain area B is extremely active whenever you judge it morally wrong to cheat on your taxes. So it is morally wrong to cheat on your taxes.” What should we make of this claim? The apparent implicit bridging premise is: If brain area B gets excited when you judge it wrong to X, then it is wrong to X. But this is a very strange bridging premise. It does not make reference to any internal features of tax-cheating that might explain its wrongness. In fact, the premise appears to suggest that an act can be wrong simply because someone thinks it is wrong and some physical activity is going on inside that person’s body. This is not the sort of thing we normally offer to get to moral conclusions, and it is not immediately clear why anyone would find it convincing. Perhaps, as we said, this is an example of how easy it is to get confused about is and ought claims.

Two points come out here. First, some attempts at drawing moral conclusions from cognitive science involve implicit bridging premises that fall apart when anyone attempts to make them explicit. This is often true of popular science journalism, in which some new psychological finding is claimed to prove that such-and-such moral belief is mistaken. At times, philosophers have accused their psychologist or philosopher opponents of doing something similar. According to Berker (2009), Joshua Greene’s neuroscience-based debunking of deontology (discussed above) lacks a convincing bridging premise. Berker suggests that Greene avoids being explicit about this premise, because when it is made explicit it turns out either to be a traditional moral argument that does not use cognitive science to motivate its conclusion, or to employ cognitive scientific claims that do not lead to any plausible moral conclusion. Hence, Berker says, the neuroscience is normatively insignificant; it does not play a role in any plausible bridging premise to a moral conclusion.

Of course, even if this is right, it implies only that Greene’s argument fails. But Berker and other philosophers have expressed doubt that any cognitive science-based argument could generate a moral conclusion. They suggest that exciting newspaper headlines and book subtitles (for example, How Science Can Determine Human Values (Harris 2010)) trade on our leaving their bridging premises implicit and unchallenged. For—this is the second point—if there were a successful bridging principle, it would be a very unusual one. Why, we want to know, could the fact that such-and-such causal process precedes or accompanies a moral judgment give us reason to change our view about that moral judgment? Coincident causal processes do not appear to be morally relevant. What your brain is doing while you are making moral judgments seems to be in the same category as the color of your kitchen tiles—why should it matter?

Of course, the fact that causal processes are not typically used in bridging premises does not show that they could not be. But it does perhaps give us reason to be suspicious, and to insist that the burden of proof is on anyone who wishes to present such an argument. They must explain to us why their use of cognitive science is normatively significant. A later section considers different attempts to meet this burden of proof. First, though, consider an argument that it can never be met, even in principle.

7. The Autonomy of Morality

Some philosophers hold that it is a mistake to try to draw moral conclusions from psychological findings because doing so misunderstands the nature of moral deliberation. According to these philosophers, moral deliberation is essentially first-personal, while cognitive science can give us only third-personal forms of information about ourselves. When you are trying to decide what to do, morally speaking, you are looking for reasons that relate your options to the values you uphold. I have moral reason not to kick puppies because I recognize value in puppies (or at least the non-suffering of puppies). Psychological claims about your brain or psychological apparatus might be interesting to someone else observing you, but they are beside the point of first-personal moral deliberation about how to act.

Here is how Ronald Dworkin makes this point. He asks us to imagine learning some new psychological fact about why we have certain beliefs concerning justice. Suppose that the evidence suggests our justice beliefs are caused by self-interested psychological processes. Then:

It will be said that it is unreasonable for you still to think that justice requires anything, one way or the other. But why is that unreasonable? Your opinion is one about justice, not about your own psychological processes . . . You lack a normative connection between the bleak psychology and any conclusion about justice, or any other conclusion about how you should vote or act. (Dworkin 1996, 124–125)

The idea here is not just that beliefs about morality and beliefs about psychology are about different topics. Part of the point is that morality is special—it is not just another subject of study alongside psychology and sociology and economics. Some philosophers put this point in terms of an a priori / a posteriori distinction: morality is something that we can work out entirely in our minds, without needing to go and do experiments (though of course we might need empirical information to apply moral principles once we have figured them out). Notice that when philosophers debate moral principles with one another, they do not typically conduct or make reference to empirical studies. They think hard about putative principles, come up with test cases that generate intuitive verdicts, and then think hard again about how to modify the principles to fit these verdicts. The process does not require empirical input, so there is no basis for empirical psychology to become involved.

This view is sometimes called the autonomy of morality (Fried 1978; Nagel 1978). It holds that, in the end, the arbiter of our moral judgments will be our moral judgments—not anything else. The only way you can get a moral conclusion from a psychological finding is to provide the normative connection that Dworkin refers to. So, for instance: if you believe that it is morally wrong to vote in a way triggered by a selfish psychological process, then finding out that your intention to vote for the Egalitarian Party is somehow selfishly motivated could give you a reason to reconsider your vote. But notice that this still crucially depends upon a moral judgment—the judgment that it is wrong to vote in a way triggered by selfish psychological processes. There is no way to get entirely out of the business of making moral judgments; psychological claims are morally inert unless accompanied by explicitly moral claims.

The point here can be made in weaker and stronger forms. The weaker form is simply this: empirical psychology cannot generate moral conclusions entirely on its own. This weaker form is accepted by most philosophers; few deny that we need at least some moral premises in order to get moral conclusions. Notice though that even the weaker form casts doubt on the idea that psychology might serve as an Archimedean arbiter of moral disagreement. Once we’ve conceded that we must rely upon moral premises to get results from psychological premises, we cannot claim that psychology is a value-neutral platform from which to settle matters of moral controversy.

The stronger form of the autonomy of morality claim holds that a psychological approach to morality fundamentally misunderstands the topic. Moral judgment, in this view, is about taking agential responsibility for our own value and actions. Re-describing these in psychological terms, so that our commitments are just various causal levers, represents an abdication of this responsibility. We should instead maintain a focus on thinking through the moral reasons for and against our positions, leaving psychology to the psychologists.

Few philosophers completely accept the strongest form of the autonomy of morality. That is, most philosophers agree that there are at least some ways psychological genealogy could change the moral reasons we take ourselves to have. But it will be helpful for us to keep this strong form in mind as a sort of null hypothesis. As we now turn to various theoretical arguments in support of a role for cognitive science in moral theory, we can understand each as a way of addressing the strong autonomy of morality challenge. They are arguments demonstrating that psychological investigation is important to our understanding of our moral commitments.

8. Moral Cognition and Moral Epistemology

Many moral philosophers think about their moral judgments (or intuitions) as pieces of evidence in constructing moral theories. Rival moral principles, such as those constituting deontology and consequentialism, are tested by seeing if they get it right on certain important cases.

Take the well-known example of the Trolley Problem. An out-of-control trolley is rumbling toward several innocents trapped on a track. You can divert the trolley, but only by sending it onto a side track where it will kill one person. Most people think it is morally permissible to divert the trolley in this case. Now imagine that the trolley cannot be diverted—it can be stopped only by physically pushing a large person into its path, causing an early crash. Most people do not think this is morally permitted. If we treat these two intuitive reactions as moral evidence, then we can affirm that how an individual is killed makes a difference to the rightfulness of sacrificing a smaller number of lives to save a larger number. Apparently, it is permissible to indirectly sacrifice an innocent as a side-effect of diverting a threat, but not permissible to directly use an innocent as a means to stop the threat. This seems like preliminary evidence against a moral theory that says that outcomes are the only things that matter morally, since the outcomes are identical in these two cases.

This sort of reasoning is at the center of how most philosophers practice normative ethics. Moral principles are proposed to account for intuitive reactions to cases. The best principles are (all else equal) those that cohere with the largest number of important intuitions. A philosopher who wishes to challenge a principle will construct a clever counterexample: a case where it just seems obvious that it is wrong to do X, but the targeted principle allows us to do X in the case. Proponents of the principle must now (a) show that their principle has been misapplied and actually gives a different verdict about the case; (b) accept that the principle has gone wrong but alter it to give a better answer; (c) bite the bullet and insist that even if the principle seems to have gone wrong here, it is still trustworthy because it is right in so many other cases; or (d) explain away the problematic intuition, by showing that the test case is underdescribed or somehow unfair, or that the intuitive reaction itself likely results from confusion. If all of this sounds a bit like the testing of scientific hypotheses against experimental data, that is no accident. The philosopher John Rawls (1971) explicitly modeled this approach, which he called “reflective equilibrium,” on hypothesis testing in science (see “Rawls”).

There are various ways to understand what reflective equilibrium aims at doing. In a widely accepted interpretation, reflective equilibrium aims at discovering the substantive truths of ethics. In this understanding, those moral principles supported in reflective equilibrium are the ones most likely to be the moral truth. (How to interpret truth in the moral domain is left aside in this article, but see “Metaethics.”) Intuitions about test cases are evidence for moral truth in much the same way that scientific observations are evidence for empirical truth. In science, our confidence in a particular theory depends on whether it gets evidential support from repeated observations, and in moral philosophy (in this conception) our confidence in a particular ethical theory depends on whether it gets evidential confirmation from multiple intuitions.

This parallel allows us to see one way in which the psychology of moral judgment might be relevant to moral philosophy. When we are testing a scientific theory, our trust in any experimental observation depends on our confidence in the scientific instruments used to generate it. If we come to doubt the reliability of our instruments, then we should doubt the observations we get from them, and so should doubt the theories they appear to support. What, then, if we come to doubt the reliability of our instruments in moral philosophy? Of course, philosophers do not use microscopes or mass spectrometers. Our instruments are nothing more than our own minds—or, more precisely, our mental abilities to understand situations and apply moral concepts to them. Could our moral minds be unreliable in the way that microscopes can be unreliable?

This is certainly not an idle worry, since we know that we make consistent mental mistakes in other domains. Think of persistent optical illusions or predictably bad mathematical reasoning. An enormous literature in cognitive science documents such systematic errors outside of morality. Against this background, it would be surprising if our moral intuitions turned out to be free of similar mistakes.

In earlier sections we discussed psychological evidence showing systematic deficits in moral reasoning. We saw, for instance, that people’s moral judgments are affected by the verbal framing in which test cases are presented (save versus die) and by the cleanliness of their immediate environment. If the readings of a scientific instrument appeared to be affected by environmental factors that had nothing to do with what the instrument was supposed to measure, then we would rightly doubt the trustworthiness of readings obtained from that instrument. In moral philosophy, neither a mere difference in verbal framing nor the dirtiness of the desk you are now sitting at seems like something that matters to whether or not a particular act is permissible. So it seems that our moral intuitions, like defective lab equipment, sometimes cannot be trusted.

Note how this argument relates to earlier claims about the autonomy of morality or the is/ought distinction. Here no one is claiming that cognitive science tells us directly which moral judgments are right and which wrong. All cognitive science can do is show us that particular intuitions are affected by certain causal factors—it cannot tell us which causal factors count as distorting and which are acceptable. It is up to us, as moral judges, to determine that differences of verbal framing (save/die) or desk cleanliness do not lead to genuine moral differences. Of course, we do not have to think very hard to decide that these causal factors are morally irrelevant—but the point remains that we are still making moral judgments, even very easy ones, and not getting our verdict directly from cognitive science.

Many proponents of a role for cognitive science in morality are willing to concede this much: in the end, any revision of our moral judgments will be authorized only by some other moral judgments, not by the science itself. But, they will now point out, there are some revisions of our moral judgments that we ought to make, and that we are able to make only because of input from cognitive science. We all agree that our moral judgments should not be affected by the cleanliness of our environment, and the science is unnecessary for our agreeing on this. But we would not know that our moral judgments are in fact affected by environmental cleanliness without cognitive science. So, in this sense at least, improving the quality of our moral judgments does seem to require the use of cognitive science. Put more positively: paying attention to the cognitive science of morality can allow us to realize that some seemingly reliable intuitions are not in fact reliable. Once these are set aside like broken microscopes, we can have greater confidence that the theories we build from the remainder will capture the moral truth. (See, for example, Sinnott-Armstrong 2008; Mason 2011.)

This argument is one way of showing the relevance of cognitive science to morality. Note that it is an epistemic debunking argument. The argument undermines certain moral intuitions as default sources of evidential justification. Cognitive science plays a debunking role by exposing the unreliable causal origins of our intuitions. Philosophers who employ debunking arguments like this do so with a range of aims. Some psychological debunking arguments are narrowly targeted, trying to show that a few particular moral intuitions are unjustified. Often this is meant to be a move within normative theory, weakening a set of principles the philosopher rejects. For instance, if intuitions supporting deontological moral theory are undermined, then we may end up dismissing deontology. Other times, philosophers intend to aim psychological debunking arguments much more widely. If it can be shown that all moral intuitions are undermined in this way, then we have grounds for skepticism about moral judgment [see “Moral Epistemology.”] The plausibility of all these arguments remains hotly debated. But if any of them should turn out to be convincing, then we have a clear demonstration of how cognitive science can matter to moral theory.

9. Non-epistemic Approaches

Epistemic debunking arguments presuppose that morality is best understood as an epistemic domain. That is, these arguments assume that there is (or could be) a set of moral truths, and that the aim of moral judgment is to track these moral truths. What if we do not accept this conception of the moral domain? What if we do not expect there to be moral truths, or do not think that moral intuition aims at tracking any such truth? In that case, should we care about the cognitive science of morality?

a. Consistency in Moral Reasoning

Obviously the answer to this question will depend on what we think the moral domain is, if it is not an epistemic domain. One common view, often associated with contemporary Kantians, is that morality concerns consistency in practical reasoning. Though we do not aim to uncover independent moral truths, we can still say that some moral beliefs are better than others, because some moral beliefs are better at cohering with the way in which we conceive of ourselves, or what we think we have most reason to do. In this understanding, a bad moral belief is not bad because it is mistaken about the moral facts (there are no moral facts) but is bad because it does not fit well with our other normative beliefs. The assumption here is that we want to be coherent, in that being a rational agent means aiming at consistency in the beliefs that ground one’s actions. [See “Moral Epistemology.”]

In this conception of the moral domain, cognitive science is useful because it can show us when we have unwittingly stumbled into inconsistency in our moral beliefs. A very simple example comes from the universalizability condition on moral judgment. It is incoherent to render different moral judgments about two people doing the exact same thing in the exact same circumstances. This is because morality (unlike taste, for instance) aims at universal prescription. If it is wrong for you to steal from me, then it is wrong for me to steal from you. (Assuming you and I are similarly situated—if I am an unjustly rich oligarch and you are a starving orphan, maybe things are different. But then this moral difference is due to different features of our respective situations. The point of universal prescription is to deny that there can be moral differences when situations are similar.) A person who explicitly denied that moral requirements applied equally to herself and to other people would not seem to really have gotten the point of morality. We would not take such a person very seriously when she complained of others violating her moral rights, if she claimed to be able to ignore those same rights in others.

We all know that we fail at universalizing our moral judgments sometimes; we all suffer moments of weakness where we try to make excuses for our own moral failings, excuses we would not permit to others. But some psychological research suggests that we may fail in this way far more often than we realize. Nadelhoffer and Feltz (2008) found that people make different judgments about moral dilemmas when they imagine themselves in the dilemma than when imagining others in the same dilemma. Presumably most people would not explicitly agree that there is such a moral difference, but they can be led into endorsing differing standards depending on whether they are presented with a “me” versus “someone else” framing of the case. This is an unconscious failure of universalization, but it is still an inconsistency. If we aim at being consistent in our practical reasoning, we should want to be alerted to unconscious inconsistencies, so that we might get a start on correcting them. And in cases like this one, we do not have introspective access to the fact that we are inconsistent, but we can learn it from cognitive science.

Note how this argument parallels the epistemic one. The claim is, again, not that cognitive science tells us what counts as a good moral judgment. Rather, cognitive science reveals to us features of the moral judgments we make, and we must then use moral reasoning to decide whether these features are problematic. Here the claim is that inconsistency in moral judgments is bad, because it undermines our aim to be coherent rational agents. We do not get that claim from cognitive science, but there are some cases where we could not apply it without the self-knowledge we gain from cognitive science. Hence the relevance of cognitive science to morality as aimed at consistency.

b. Rational Agency

There is another way in which cognitive science can matter to the coherent rational agency conception of morality. Some findings in cognitive science may threaten the intelligibility of this conception altogether. Recall, from section 2, the psychologist Jonathan Haidt’s work on moral dumbfounding; people appear to spontaneously invent justifications for their intuitive moral verdicts, and stick with these verdicts even after the justifications are shown to fail. If pressed hard enough, people will admit they simply do not know why they came to the verdicts, but hold to them nevertheless. In Haidt’s interpretation, these findings show that moral judgment happens almost entirely unconsciously, with conscious moral reasoning mostly a post hoc epiphenomenon.

If Haidt is right, Jeanette Kennett and Cordelia Fine (2009) point out, then this poses a serious problem for the ideal of moral agency. For us to count as moral agents, there needs to be the right sort of connection between our conscious reasoning and our responses to the world. A robot or a simple animal can react, but a rational agent is one who can critically reflect upon her reasons for action and come to a deliberative conclusion about what she ought to do. Yet if we are morally dumbfounded in the way Haidt suggests, then our conscious moral reasoning may lack the appropriate connection to our moral reactions. We think that we know why we judge and act as we do, but actually the reasons we consciously endorse are mere post hoc confabulations.

In the end, Kennett and Fine argue that Haidt’s findings do not actually lead to this unwelcome conclusion. They suggest that he has misinterpreted what the experiments show, and that there is a more plausible interpretation that preserves the possibility of conscious moral agency. Note that responding in this way concedes that cognitive science might be relevant to assessing the status of our moral judgments. The dispute here is only over what the experiments show, not over what the implications would be if they showed a particular thing. This leaves the door open for further empirical research on conscious moral agency.

One possible approach is the selective application of Haidt’s argument. If it could be shown that certain moral judgments—those about a particular topic or sub-domain of morality—are especially prone to moral dumbfounding, then we might have the basis for disqualifying them from inclusion in reflective moral theory. This seems, at times, to be the approach adopted by Joshua Greene (see section 4) in his psychological attack on deontology. According to Greene (2014), deontological intuitions are of a psychological type distinctively disconnected from conscious reflection and should accordingly be distrusted. Many philosophers dispute Greene’s claims (see, for instance, Kahane 2012), but this debate itself shows the richness of engagement between ethics and cognitive science.

c. Intersubjective Justification

There is one further way in which cognitive science may have relevance to moral theory. In this last conception, morality is essentially concerned with intersubjective justification. Rather than trying to discover independent moral truths, my moral judgments aim at determining when and how my preferences can be seen as reasonable by other people. A defective moral judgment, in this conception, is one that reflects only my own personal idiosyncrasies and so will not be acceptable to others. For instance, perhaps I have an intuitive negative reaction to people who dress their dogs in sweaters even when it is not cold. If I come to appreciate that my revulsion at this practice is not widely shared, and that other people cannot see any justification for it, then I may conclude that it is not properly a moral judgment at all. It may be a matter of personal taste, but it cannot be a moral judgment if it has no chance of being intersubjectively justified.

Sometimes we can discover introspectively that our putative moral judgments are actually not intersubjectively justifiable, just by thinking carefully about what justifications we can or cannot offer. But there may be other instances in which we cannot discover this introspectively, and where cognitive science may help. This is especially so when we have unknowingly confabulated plausible-sounding justifications in order to make our preferences appear more compelling than they are (Rini 2013). For example, suppose that I have come to believe that a particular charity is the most deserving of our donations, and I am now trying to convince you to agree. You point out that other charities seem to be at least as effective, but I insist. By coincidence, the next day I participate in a psychological study of color association. The psychologists’ instruments notice that I react very positively to the color forest green—and then I remember that my favorite charity’s logo is a deep forest green. If this turns out to be the explanation for why I argued for the charity, then I should doubt that I have provided an intersubjective justification. Maybe my fondness for the charity’s logo is an okay reason for me to make a personal choice (if I am otherwise indifferent), but it certainly is not a reason for you to agree. Now that I am aware of this psychological influence, I should consider the possibility that I have merely confabulated the reasons I offered to you.

10. Objections and Alternatives

The preceding sections have focused on negative implications of the cognitive science of moral judgment. We have seen ways in which learning about a particular moral judgment’s psychological origins might lead us to disqualify it, or at least reduce our confidence in it. This final section briefly considers some objections to drawing such negative implications, and also discusses more positive proposals for the relationship between cognitive science and moral philosophy.

a. Explanation and Justification

One objection to disqualifying a moral judgment on cognitive scientific grounds is that this involves a confusion between explanatory reasons and justifying reasons. The explanatory reason for the fact that I judge X to be immoral could be any number of psychological factors. But my justifying reason for the judgment is unlikely to be identical with the explanatory reason. Consider my judgment that it is wrong to tease dogs with treats that will not be provided. Perhaps the explanatory reason for my believing this is that my childhood dog bit me when I refused to share a sandwich. But this is not how I justify my judgment—the justifying reason I have is that dogs suffer when led to form unfulfilled expectations, and the suffering of animals is a moral bad. As long as this is a good justifying reason, the explanatory reason does not really matter. So, runs the objection, those who disqualify moral judgments on cognitive scientific grounds are looking at the wrong thing—they should be asking about whether the judgment is justified, not why (psychologically speaking) it was made (Kamm 1998; van Roojen 1999).

One problem with this objection is that it assumes we have a basis for affirming the justifying reasons for a judgment that is unaffected by cognitive scientific investigation. Obviously if we had oracular certainty that judgment X is correct, then we should not worry about how we came to make the judgment. But in moral theory we rarely (if ever) have such certainty. As discussed earlier (see section 8), our justification for trusting a particular judgment is often dependent upon how well it coheres with other judgments and general principles. So if a cognitive scientific finding showed that some dubious psychological process is responsible for many of our moral judgments, their ability to justify one another may be in question. To see the point, consider the maximal case: suppose you learned that all of your moral judgments were affected by a chemical placed in the water supply by the government. Would this knowledge not give you some reason to second-guess your moral judgments? If that is right, then it seems that our justifying reasons for holding to a judgment can be sensitive to at least some discoveries about the explanatory reasons for it. (For related arguments, see Street 2006 and Joyce 2006.)

b. The Expertise Defense

Another objection claims to protect the judgments used in moral theory-making even while allowing the in-principle relevance of cognitive scientific findings. The claim is this: cognitive science uses research subjects who are not experts in making moral judgments. But moral philosophers have years of training at drawing careful distinctions, and also typically have much more time than research subjects to think carefully about their judgments. So even if ordinary participants in cognitive science studies make mistakes due to psychological quirks, we should not assume that the judgments of experts will resemble those of non-experts. We do not doubt the competence of expert mathematicians simply because the rest of us make arithmetic mistakes (Ludwig 2007). So, the objection runs, if it is plausible to think of moral philosophers as experts, then moral philosophers can continue to rely upon their judgments whatever the cognitive science says about the judgments of non-experts.

Is this expertise defense plausible? One major problem is that it does not appear to be well supported by empirical evidence. In a few studies (Schwitzgebel and Cushman 2012; Tobia, Buckwalter, and Stich 2013), people with doctorates in moral philosophy have been subjected to the same psychological tests as non-expert subjects and appear to make similar mistakes. There is some dispute about how to interpret these studies (Rini 2015), but if they hold up then it will be hard to defend the moral judgments of philosophers on grounds of expertise.

c. The Regress Challenge

A final objection comes in the form of a regress challenge. Henry Sidgwick first made the point, in his Methods of Ethics, that it would be self-defeating to attempt to debunk judgments on the grounds of their causal origins. The debunking itself would rely on some judgments for its plausibility, and we would then be led down an infinite regress in querying the causal origins of these judgments, and then the causal origins of the judgments responsible for our judgments about those first origins, and so on. Sidgwick seems to be discussing general moral skepticism, but a variant of this argument presents a regress challenge to even selective cognitive scientific debunking of particular moral judgments. According to the objection, once we have opened the door to debunking, we will be drawn into an inescapable spiral of producing and challenging judgments about the moral trustworthiness of various causal origins. Perhaps, then, we should not start on this project at all.

This objection is limited in effect; it applies most obviously to epistemic forms of cognitive scientific debunking. In non-epistemic conceptions of the aims of moral judgment, it may be possible to ignore some of the objection’s force. The objection is also dependent upon certain empirical assumptions about the interdependence of the causal origins driving various moral judgments. But if sustained, the regress challenge for epistemic debunking seems significant.

d. Positive Alternatives

Finally, we might consider an alternative take on the relationship between moral judgment and cognitive science. Unlike most of the approaches discussed above, this one is positive rather than negative. The idea is this: if cognitive science can reveal to us that we already unconsciously accept certain moral principles, and if these fit with the judgments we think we have good reason to continue to hold, then cognitive science may be able to contribute to the construction of moral theory. Cognitive science might help you to explicitly articulate a moral principle that you already accepted implicitly (Mikhail 2011; Kahane 2013). In a sense, this is simply scientific assistance to the traditional philosophical project of making explicit the moral commitments we already hold—the method of reflective equilibrium developed by Rawls and employed by most contemporary ethicists. In this view, the use of cognitive science is likely to be less revolutionary, but still quite important. Though negative approaches have received most discussion, the positive approach seems to be an interesting direction for future research.

11. References and Further Reading

  • Berker, Selim. 2009. “The Normative Insignificance of Neuroscience.” Philosophy & Public Affairs 37 (4): 293–329. doi:10.1111/j.1088-4963.2009.01164.x.
  • Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
  • Cosmides, Leda, and John Tooby. 1992. “Cognitive Adaptations for Social Exchange.” In The Adapted Mind: Evolutionary Psychology and the Generation of Culture, edited by Jerome H. Barkow, Leda Cosmides, and John Tooby, 163–228. New York: Oxford University Press.
  • Crockett, Molly J., Luke Clark, Marc D. Hauser, and Trevor W. Robbins. 2010. “Serotonin Selectively Influences Moral Judgment and Behavior through Effects on Harm Aversion.” Proceedings of the National Academy of Sciences of the United States of America 107 (40): 17433–38. doi:10.1073/pnas.1009396107.
  • Danziger, Shai, Jonathan Levav, and Liora Avnaim-Pesso. 2011. “Extraneous Factors in Judicial Decisions.” Proceedings of the National Academy of Sciences 108 (17): 6889–92. doi:10.1073/pnas.1018033108.
  • Dworkin, Ronald. 1996. “Objectivity and Truth: You’d Better Believe It.” Philosophy & Public Affairs 25 (2): 87–139.
  • Dwyer, Susan. 2006. “How Good Is the Linguistic Analogy?” In The Innate Mind: Culture and Cognition, edited by Peter Carruthers, Stephen Laurence, and Stephen Stich. Oxford: Oxford University Press.
  • Eskine, Kendall J., Natalie A. Kacinik, and Jesse J. Prinz. 2011. “A Bad Taste in the Mouth: Gustatory Disgust Influences Moral Judgment.” Psychological Science 22 (3): 295–99. doi:10.1177/0956797611398497.
  • Fried, Charles. 1978. “Biology and Ethics: Normative Implications.” In Morality as a Biological Phenomenon: The Presuppositions of Sociobiological Research, edited by Gunther S. Stent, 187–97. Berkeley, CA: University of California Press.
  • Gilligan, Carol. 1982. In a Different Voice: Psychology Theory and Women’s Development. Cambridge, MA: Harvard University Press.
  • Greene, Joshua D. 2008. “The Secret Joke of Kant’s Soul.” In Moral Psychology, Vol. 3. The Neuroscience of Morality: Emotion, Brain Disorders, and Development, edited by Walter Sinnott-Armstrong, 35–80. Cambridge, MA: MIT Press.
  • Greene, Joshua D. 2014. “Beyond Point-and-Shoot Morality: Why Cognitive (Neuro)Science Matters for Ethics.” Ethics 124 (4): 695–726. doi:10.1086/675875.
  • Haidt, Jonathan. 2001. “The Emotional Dog and Its Rational Tail: A Social Intuitionist Approach to Moral Judgment.” Psychological Review 108 (4): 814–34.
  • Haidt, Jonathan. 2012. The Righteous Mind: Why Good People Are Divided by Politics and Religion. 1st ed. New York: Pantheon.
  • Hamlin, J. Kiley, Karen Wynn, and Paul Bloom. 2007. “Social Evaluation by Preverbal Infants.” Nature 450 (7169): 557–59. doi:10.1038/nature06288.
  • Harris, Sam. 2010. The Moral Landscape: How Science Can Determine Human Values. 1st ed. New York: Free Press.
  • Hauser, Marc D. 2006. “The Liver and the Moral Organ.” Social Cognitive and Affective Neuroscience 1 (3): 214–20. doi:10.1093/scan/nsl026.
  • Joyce, Richard. 2006. The Evolution of Morality. 1st ed. Cambridge, MA: MIT Press.
  • Kahane, Guy. 2012. “On the Wrong Track: Process and Content in Moral Psychology.” Mind & Language 27 (5): 519–45. doi:10.1111/mila.12001.
  • Kahane, Guy. 2013. “The Armchair and the Trolley: An Argument for Experimental Ethics.” Philosophical Studies 162 (2): 421–45. doi:10.1007/s11098-011-9775-5.
  • Kamm, F. M. 1998. “Moral Intuitions, Cognitive Psychology, and the Harming-Versus-Not-Aiding Distinction.” Ethics 108 (3): 463–88.
  • Kennett, Jeanette, and Cordelia Fine. 2009. “Will the Real Moral Judgment Please Stand up? The Implications of Social Intuitionist Models of Cognition for Meta-Ethics and Moral Psychology.” Ethical Theory and Moral Practice 12 (1): 77–96.
  • Knobe, Joshua. 2003. “Intentional Action and Side Effects in Ordinary Language.” Analysis 63 (279): 190–94. doi:10.1111/1467-8284.00419.
  • Koenigs, Michael, Liane Young, Ralph Adolphs, Daniel Tranel, Fiery Cushman, Marc Hauser, and Antonio Damasio. 2007. “Damage to the Prefrontal Cortex Increases Utilitarian Moral Judgements.” Nature 446 (7138): 908–11. doi:10.1038/nature05631.
  • Kohlberg, Lawrence. 1971. “From ‘Is’ to ‘Ought’: How to Commit the Naturalistic Fallacy and Get Away with It in the Study of Moral Development.” In Cognitive Development and Epistemology, edited by Theodore Mischel. New York: Academic Press.
  • Ludwig, Kirk. 2007. “The Epistemology of Thought Experiments: First Person versus Third Person Approaches.” Midwest Studies In Philosophy 31 (1): 128–59. doi:10.1111/j.1475-4975.2007.00160.x.
  • Mason, Kelby. 2011. “Moral Psychology And Moral Intuition: A Pox On All Your Houses.” Australasian Journal of Philosophy 89 (3): 441–58. doi:10.1080/00048402.2010.506515.
  • Mikhail, John. 2011. Elements of Moral Cognition: Rawls’ Linguistic Analogy and the Cognitive Science of Moral and Legal Judgment. 3rd ed. Cambridge: Cambridge University Press.
  • Moll, Jorge, Roland Zahn, Ricardo de Oliveira-Souza, Frank Krueger, and Jordan Grafman. 2005. “The Neural Basis of Human Moral Cognition.” Nature Reviews Neuroscience 6 (10): 799–809. doi:10.1038/nrn1768.
  • Nadelhoffer, Thomas, and Adam Feltz. 2008. “The Actor–Observer Bias and Moral Intuitions: Adding Fuel to Sinnott-Armstrong’s Fire.” Neuroethics 1 (2): 133–44.
  • Nagel, Thomas. 1978. “Ethics as an Autonomous Theoretical Subject.” In Morality as a Biological Phenomenon: The Presuppositions of Sociobiological Research, edited by Gunther S. Stent, 198–205. Berkeley, CA: University of California Press.
  • Newman, George E., Paul Bloom, and Joshua Knobe. 2014. “Value Judgments and the True Self.” Personality and Social Psychology Bulletin 40 (2): 203–16. doi:10.1177/0146167213508791.
  • Nichols, Shaun, and Joshua Knobe. 2007. “Moral Responsibility and Determinism: The Cognitive Science of Folk Intuitions.” Noûs 41 (4): 663–85. doi:10.1111/j.1468-0068.2007.00666.x.
  • Rawls, John. 1971. A Theory of Justice. 1st ed. Cambridge, MA: Harvard University Press.
  • Rini, Regina A. 2013. “Making Psychology Normatively Significant.” The Journal of Ethics 17 (3): 257–74. doi:10.1007/s10892-013-9145-y.
  • Rini, Regina A. 2015. “How Not to Test for Philosophical Expertise.” Synthese 192 (2): 431–52.
  • Ruff, C. C., G. Ugazio, and E. Fehr. 2013. “Changing Social Norm Compliance with Noninvasive Brain Stimulation.” Science 342 (6157): 482–84. doi:10.1126/science.1241399.
  • Schnall, Simone, Jonathan Haidt, Gerald L. Clore, and Alexander H. Jordan. 2008. “Disgust as Embodied Moral Judgment.” Personality & Social Psychology Bulletin 34 (8): 1096–1109. doi:10.1177/0146167208317771.
  • Schwitzgebel, Eric, and Fiery Cushman. 2012. “Expertise in Moral Reasoning? Order Effects on Moral Judgment in Professional Philosophers and Non-Philosophers.” Mind and Language 27 (2): 135–53.
  • Sinnott-Armstrong, Walter. 2008. “Framing Moral Intuition.” In Moral Psychology, Vol 2. The Cognitive Science of Morality: Intuition and Diversity, 47–76. Cambridge, MA: MIT Press.
  • Skinner, B. F. 1971. Beyond Freedom and Dignity. New York: Knopf.
  • Street, Sharon. 2006. “A Darwinian Dilemma for Realist Theories of Value.” Philosophical Studies 127 (1): 109–66.
  • Strohminger, Nina, Richard L. Lewis, and David E. Meyer. 2011. “Divergent Effects of Different Positive Emotions on Moral Judgment.” Cognition 119 (2): 295–300. doi:10.1016/j.cognition.2010.12.012.
  • Sunstein, Cass R. 2005. “Moral Heuristics.” Behavioral and Brain Sciences 28 (4): 531–42.
  • Terbeck, Sylvia, Guy Kahane, Sarah McTavish, Julian Savulescu, Neil Levy, Miles Hewstone, and Philip Cowen. 2013. “Beta Adrenergic Blockade Reduces Utilitarian Judgement.” Biological Psychology 92 (2): 323–28.
  • Tobia, Kevin, Wesley Buckwalter, and Stephen Stich. 2013. “Moral Intuitions: Are Philosophers Experts?” Philosophical Psychology 26 (5): 629–38. doi:10.1080/09515089.2012.696327.
  • Tversky, A., and D. Kahneman. 1981. “The Framing of Decisions and the Psychology of Choice.” Science 211 (4481): 453–58. doi:10.1126/science.7455683.
  • Van Roojen, Mark. 1999. “Reflective Moral Equilibrium and Psychological Theory.” Ethics 109 (4): 846–57.
  • Wheatley, Thalia, and Jonathan Haidt. 2005. “Hypnotic Disgust Makes Moral Judgments More Severe.” Psychological Science 16 (10): 780–84. doi:10.1111/j.1467-9280.2005.01614.x.
  • Wilson, E. O. 1975. Sociobiology: The New Synthesis. Cambridge, MA: Harvard University Press.
  • Yang, Qing, Xiaochang Wu, Xinyue Zhou, Nicole L. Mead, Kathleen D. Vohs, and Roy F. Baumeister. 2013. “Diverging Effects of Clean versus Dirty Money on Attitudes, Values, and Interpersonal Behavior.” Journal of Personality and Social Psychology 104 (3): 473–89. doi:10.1037/a0030596.
  • Young, Liane, Joan Albert Camprodon, Marc Hauser, Alvaro Pascual-Leone, and Rebecca Saxe. 2010. “Disruption of the Right Temporoparietal Junction with Transcranial Magnetic Stimulation Reduces the Role of Beliefs in Moral Judgments.” Proceedings of the National Academy of Sciences 107 (15): 6753–58. doi:10.1073/pnas.0914826107.

 

Author Information

Regina A. Rini
Email: gina.rini@nyu.edu
New York University
U. S. A.

Theories of Religious Diversity

Religious diversity is the fact that there are significant differences in religious belief and practice. It has always been recognized by people outside the smallest and most isolated communities. But since early modern times, increasing information from travel, publishing, and emigration has forced thoughtful people to reflect more deeply on religious diversity. Roughly, pluralistic approaches to religious diversity say that, within bounds, one religion is as good as any other. In contrast, exclusivist approaches say that only one religion is uniquely valuable. Finally, inclusivist theories try to steer a middle course by agreeing with exclusivism that one religion has the most value while also agreeing with pluralism that others still have significant religious value.

What values are at issue? Literature since 1950 focuses on the truth or rationality of religious teachings, the veridicality (conformity with reality) of religious experiences, salvific efficacy (the ability to deliver whatever cure religion should provide), and alleged directedness towards one and the same ultimate religious object.

The exclusivist-inclusivist-pluralist trichotomy has become standard since the 1980s. Unfortunately, it is often used with some mix of the above values in mind, leaving it unclear exactly which values are pertinent. While this trichotomy is sometimes thought of in terms of general attitudes that a religious person may have towards other religions—approximately the attitudes of rejection, limited openness, and wide acceptance respectively—in this article the three positions figure as theories concerning the facts of religious diversity. “Religious pluralism” in some contexts means an informed, tolerant, and appreciative or sympathetic view of the various religions. In other contexts, “religious pluralism” is a normative principle requiring that people of all or most religions should be treated the same. In this article, “religious pluralism” refers to a theory about the diversity of religions. Finally, some authors use “descriptive religious pluralism” to mean what is here called “religious diversity,” and use “normative religious pluralism” for views that are here called varieties of “religious pluralism.” While the trichotomy has been repeatedly challenged, it is still widely used, and can be precisely defined in various ways.

Table of Contents

  1. Facts and Theories of Religious Diversity
    1. History
    2. Theories and Associations
  2. Religious Pluralism
    1. Naive Pluralisms
    2. Core Pluralisms
    3. Hindu Pluralisms
    4. Ultimist Pluralisms
    5. Identist Pluralisms
    6. Expedient Means Pluralism
  3. Exclusivism
    1. Terminological Problems
    2. Naive Exclusivisms
    3. Christian Exclusivisms
  4. Inclusivism
    1. Terminological Problems
    2. Abrahamic Inclusivisms
    3. Buddhist Inclusivisms
    4. Plural Salvations Inclusivism
  5. References and Further Reading

1. Facts and Theories of Religious Diversity

Scholars distinguish seven aspects of religious traditions: the doctrinal and philosophical, the mythic and narrative, the ethical and legal, the ritual and practical, the experiential and emotional, the social and organizational, and the material and artistic. (Smart 1996) Religious traditions differ along all these dimensions. These are the undisputed facts of religious diversity. Some authors, usually ones who wish to celebrate these facts, call them “religious pluralism,” but this entry reserves this label for a family of theories about the facts of religious diversity.

It is arguably the doctrinal and philosophical aspects of a religion which are foundational, in that the other aspects can only be understood in light of them. Particularly central to any religion’s teaching are its diagnosis of the fundamental problem facing human beings and its suggested cure, a way to positively and permanently resolve this problem. (Prothero 2010, 13-6; Yandell 1999, 16-9)

a. History

Scholarly study of a wide range of religions, and comparison and evaluation of them, was to a large extent pioneered by Christian missionaries in the nineteenth century seeking to understand those whom they sought to convert. This led to both the questioning and the defense of various “exclusivist” traditional Christian claims. (Netland 2001, 23-54) Theories of religious diversity have largely been driven by attacks on and defenses of such claims, and discussions continue within the realm of Christian theology. (Kärkkäinen 2003; Netland 2001) The most famous of these has been the view (held by some Christians) that all non-Christians are doomed to an eternity of conscious suffering in hell. (Hick 1982 ch. 2; see section 3c below)

All the theories discussed in this article are ways that (usually religious) people regard other religions, but here we discuss them abstractly, without descending much into the details of how they would be worked out in the teachings and practices of any one religion. Such would be the work of a religiously embedded and committed theology of religious diversity, not of a general philosophy of religious diversity.

b. Theories and Associations    

Many people associate any sort of pluralist theory of religious diversity with a number of arguably good qualities. These qualities include but are not limited to: being humble, reasonable, kind, broad-minded, open-minded, informed, cosmopolitan, modern, properly appreciative of difference, non-bigoted, tolerant, being opposed to proselytizing (attempts to convince those outside the religion to convert to it), anti-colonialist, and anti-imperialist. In contrast, any non-pluralist theory of religious diversity is associated with many arguably bad qualities. These negative qualities include but are not limited to: being arrogant, unreasonable, mean, narrow-minded, closed-minded, uninformed, provincial, out of date, xenophobic, bigoted, intolerant, in favor of proselytism, colonialist, and imperialist.

These, however, are mere associations; there seem to be no obvious entailments between the theories of religious diversity and the above qualities. In principle, it would seem that an exclusivist or inclusivist may have all or most of the good qualities, and one who accepts a theory of religious pluralism may have all or most of the bad qualities. These connections between theory and character – which are believed by some to provide practical arguments for or objections to various theories – need to be argued for. But it is very rare for a scholar to go beyond merely assuming or asserting some sort of causal connection between the various theories about religious diversity and the above virtues and vices.

2. Religious Pluralism

A theory of religious pluralism says that all religions of some kind are the same in some valuable respect(s). While this is compatible with some religion being the best in some other respect(s), the theorists using this label have in mind that many religions are equal regarding the central value(s) of religion. (Legenhausen 2009)

The term “religious pluralism” is almost always used for a theory asserting positive value for many or most religions. But one may also speak of “negative religious pluralism,” on which most or all religions have little or no positive value and are equal in this respect. This would be the view of many naturalists, who hold that all religions are the product of human imagination, and fail to have most or all of the values claimed for them. (Byrne 2004; Feuerbach 1967)

a. Naive Pluralisms

Though naive pluralisms are not common amongst scholars in relevant fields, they are important to mention because they are entertained by many people as they begin to reflect on religious diversity.

An uninformed person, noting certain commonalities of religious belief and practice, may suppose that all religions are the same, namely, that there are no significant differences between religious traditions. This naive pluralism is refuted by accurate information on religious differences. (Prothero 2010)

A common form of negative pluralism may be called “verificationist pluralism.” This is the view that all religious claims are meaningless, and as a result are incapable of rational evaluation.  This is because they cannot be empirically verified, that is, their truth or falsity is not known by way of observational evidence.

There are three serious problems with verificationist pluralism. First, some religious claims can be empirically confirmed or disconfirmed. For example, people have empirically disconfirmed claims that Jesus will visibly return to rule the earth from Jerusalem in 1974, or that magical “ghost shirts” will protect the wearer from bullets, or that saying a certain mantra three times will protect one from highway robbers. Second, the claim that meaningfulness requires the possibility of empirical verification has little to recommend it, and is self-refuting (that is, the claim itself is not empirically verifiable). (Peterson et al. 2013, 268-72) Third, religions differ in how much, if at all, they make empirically verifiable claims, so it is unclear that all religions will be equal in making meaningless claims.

While there are other sorts of negative naive pluralism, we shall concentrate on positive kinds here, as most of the scholarly literature focuses on those.

Some forms of naive pluralism suppose that all religions will turn out to be complementary. One idea is that all religions would turn out to be parts of one whole (either one religion or at least one conglomeration of religions). This unified consistency may be hoped for in terms of truth, or in terms of practice. With truth, the problem is that it is hard to see how the core claims of the religions could all be true. For instance, some religions teach that the ultimate reality (the most real, only real, or primary thing) is ineffable (such that no human concept can apply to it). But others teach that the ultimate reality is a perfect self, a being capable of knowledge, will, and intentional action.

What about the religions’ practices – are they all complementary? Some practices seem compatible, such as church attendance and mindfulness meditation. On the other hand, others seem to make little or no sense outside the context of the home religion, and others are simply incompatible. What sense, for instance, would it make for a Zen Buddhist to undergo the Catholic rites of confession and penance? Or what sense would it make for an Orthodox Jew, whose religion teaches him to “be fruitful and multiply,” to employ the Buddhist practice of viewing corpses at a burial ground so as to expunge the unwanted liability of sexual desire? Nor can he be fruitful and multiply while living as a celibate Buddhist monk. Dabblers and hobbyists freely stitch together unique quilts of religious beliefs and practices, but such constructions seem to make little sense once a believer has accepted any particular religion. Many religious claims will be logically incompatible with the accepted diagnosis, and many religious practices will be useless or counter-productive when it comes to getting what one believes to be the cure.

Another way in which pluralism can be naive is the common assumption that absolutely all religions are good in significant ways, for example, by improving their adherents’ lives, facilitating interaction with God, or leading to eternal salvation. However, someone who assumes this is probably thinking only of large, respectable, and historically important religions. It is not hard to find religions or “religious cults” which would not plausibly be thought of as good in the way(s) that a pluralist has in mind. For example, a religious group may function only to satisfy the desires of its founder, discourage the worship of God, encourage the sexual abuse of children, or lead to the damnation of its members.

Carefully worked out theories of religious pluralism often sound all-inclusive. However, they nearly always have at least one criterion for excluding religions as inferior in the aspect(s) they focus on. A difficulty for any pluralist theory is how to restrict the group of equally good religions without losing the appearance of being all-accepting or wholly non-judgmental. A common strategy here is to simply ignore disreputable religious traditions, only discussing the prestigious ones.

b. Core Pluralisms

An improvement upon naive pluralism acknowledges differences in all the aspects of religions, but separates peripheral from core differences. A core pluralist claims that all religions of some kind share a common core, and that this is really what matters about the religions; their equal value is found in this common core. If the core is true teachings, they’ll all be true (to some degree). If the core is veridical experiences, all religions will enable ways to perceive whatever the objects of religious experience are. If the core is salvifically effective practice, then all will be equal in that each is equally well a means to obtaining the cure. Given that any core pluralism inevitably downplays the other non-core elements of the religions, this approach has also been called “reductive pluralism.” (Legenhausen 2006)

The most influential recent proponent of a version of core pluralism has been Huston Smith (b. 1919). In his view, the common core of religions is a tiered worldview. This encompasses the idea that physical reality, the terrestrial plane, is contained within and controlled by a more real intermediate plane (that is, the subtle, animic, or psychic plane) which is in turn contained and controlled by the celestial plane. This celestial plane is a personal God. Beyond this is infinite, unlimited Being (also called “Absolute Truth,” “the True Reality,” “the Absolute,” “God”). Given that it is ineffable, this Being is neither a god nor the God of monotheism. It is more real than all that comes from it. The various “planes” are not distinct from it, and it is the ultimate object of all desire, and the deepest reality within each human self. Some experience this Being as if it were a god, but the most able gain a non-conceptual awareness of it in its ineffable glory. Smith holds that in former ages, and among primitive peoples now, such a worldview is near universal. Only modern people, blinded by the misunderstanding that science reveals all, have forgotten it. (Smith 1992, 2003 ch. 3) The highest level in some sense is the human “Spirit,” the deep self which underlies the self of ordinary experience. Appropriating Hindu, Buddhist, and Christian language, Smith says that this “spirit is the Atman that is Brahman, the Buddha-nature that appears when our finite selves get out of its way, my istigkeit (is-ness) which…we see is God’s is-ness.” (Smith 2003 ch. 3, 3-4)

Such an outlook, often called “the perennial philosophy” or “traditionalism,” owes much to nineteenth and twentieth century occult literature, and to neo-Platonism and its early modern revivals. (Sedgwick 2004) Like traditional religions, it too offers a diagnosis of the human condition and a cure: the diagnosis is a fall from primordial spirituality into modern spiritual poverty, and the cure is adopting the outlook sketched above. Most importantly, it offers a chance to discover the deep self as Being. A muted ally in this was the influential religious scholar Mircea Eliade (1907-86), whose work focused on comparing mythologies, and on what he viewed as an important, primitive religious outlook, which separates things into the sacred and the profane.

This “perennial philosophy” appeals to many present-day people, particularly those who, like Smith, have moved from a childhood religious faith (in his case, Christianity) to a more naturalistic and hence atheistic outlook, an outlook commonly perceived as meaningless, hopeless, and devoid of value. (Smith 2003, 2012)

Dissenters are found among historians of religion, who deny that there is and has always been a common core in all of the world’s religions. Others dissent because they accept the incompatible diagnosis and cure taught by some other religion, such as the ones found in Islam or Christianity. (Legenhausen 2002) Those who believe the ultimate reality to be a unique god object to Smith’s view that the ultimate reality is ineffable, and so not, in itself, a god. Others find it excessive that Smith accepts other traditional doctrines, such as the claims that Plato’s Forms are not only real but alive, that in dreaming the “subtle body” leaves the body and roams free in the intermediate realm, and that there are siddhis (supernatural powers gained by meditation), possession by spirits, psychic phenomena, and so on.

This sort of core pluralism was propounded by some members of the Theosophical Society such as co-founder Helena Petrovna Blavatsky (1831-91) in her widely read The Secret Doctrine (1888), and by French convert to Sufi Islam René Guénon (1886-1951) and those influenced by him, such as the eclectic Swiss-German writer Frithjof Schuon (1907-98). This sort of pluralism, following Guénon and Schuon, has been championed by Iranian philosopher Seyyed Hossein Nasr (b. 1933) and English convert to Islam, Martin Lings (1909-2005) who was a biographer of Muhammad. (Legenhausen 2002)

While Smith’s view rests on belief in an impersonal Ultimate, other versions of core pluralism rest upon monotheism. Thus, the Hindu intellectual Ram Mohun Roy (1772/4-1833) held that Hinduism, Islam, and Christianity, when understood in their original, non-idolatrous and non-superstitious ways, all teach the important truths about God and humankind, enabling humans to love and serve God. Roy, however, always retained his Hindu and Brahmin identities. (Sharma 1999, ch. 2) He was what we now call a “pluralistic Hindu” (and most Hindus would add that he was also an unorthodox Hindu).

Swedish philosopher Mikael Stenmark explores what he calls the “some-are-equally-right view” about religious diversity, and discusses a version of it on which Judaism, Christianity, and Islam are held to possess equal amounts of religiously important truths. (Stenmark 2009) He does not advocate this view, but explores it as an alternative to exclusivism, inclusivism, and Hickian identist pluralism. Stenmark views it as most similar to identist pluralism (see 2e below). But Stenmark’s “some-are-equally-right view” can also be seen as a form of core pluralism, the core being truths about the one God and our need for relationship with God. On this view, “all” the religions are right to the same degree, that is, all versions of monotheism (or perhaps, ethical monotheisms, or Abrahamic monotheisms). This account is narrower than “pluralism” is usually thought to be, but it is arguably a version of it.

c. Hindu Pluralisms

The tradition now called “Hinduism” is and always has been very internally diverse. In modern times, it tries to equalize other religions in the same ways it equalizes the apparently contrary claims and practices internal to it. While elements within it have been sectarian and exclusivistic, modern Hindu thought is usually pluralistic. Furthermore, Hindu thought has shifted in modern times from a scriptural to an experiential emphasis. (Long 2011) Still, some Hindus object to various kinds of pluralism. (Morales 2008)

Within the pluralistic mainstream of Hinduism, a popular slogan is that “all religions are true,” but this may be an expression of almost any sort of positive religious pluralism. Moreover, some influential modern Hindu leaders have adopted a complicated rhetoric of “universal religion,” which often assumes some sort of religious pluralism. (Sharma 1999, ch. 6)

At bare minimum, the slogan that “all religions are true” means that all religions are in some way directed towards one Truth, where this is understood as an ultimate reality. Thus, it has been observed that identist religious pluralism (see 2e below) “is essentially a Hindu position,” and closely resembles Advaita Vedanta thought. (Long 2005) The slogan may also imply that all religions feature veridical experience of that one object, by way of a non-cognitive, immediate awareness. (Sharma 1990) This modern Hindu outlook has proven difficult to formulate in any clear way. One prominent scholar argues that the “neo-Hindu” position on religious diversity (that is, modern Hindu pluralism) is not the view that all religions are equal, one, true, or the same. Instead, it is the view that all religions are “valid,” meaning that they have some degree of (some kind of) value. (Sharma 1979) But if there is no one clear modern Hindu pluralism, it remains that various modern Indian thinkers have held to versions of core or identist pluralism.

Paradoxically, such pluralism is often expressed along with claims that Hinduism is greatly superior in various ways to other religions. (Datta 2005; Morales 2008) It has been argued that whether and how a Hindu embraces a theory of religious pluralism will depend crucially on what she takes “Hinduism” to be. (Long 2005)

d. Ultimist Pluralisms

Building on the speculative metaphysics of Alfred North Whitehead’s (1861–1947) Process and Reality (1929), and work by his student Charles Hartshorne (1897-2000), theologians John Cobb and David Ray Griffin have advocated what the latter calls “deep,” “differential,” and “complementary” pluralism – what is here described as “ultimist.” (Cooper 2006, ch. 7; Griffin 2005b)

Cobb and Griffin assume that there is no supernatural intervention (any miraculous interruption of the ordinary course of nature) by God or other beings. This, it is hoped, rules out anyone having grounds for believing any particular religion to be the uniquely best religion. (Griffin 2005a) They do, however, take seriously at least many of the unusual religious experiences people report. They hypothesize that some religious mystics really do perceive (without using ordinary sense organs) some “ultimate” (that is, something regarded as ultimate). Thus, in experiencing what they call “Emptiness” or the Dharmakaya (truth body), Mahayana Buddhists really do perceive what Cobb calls Creativity (or Being Itself), as do Advaita Vedanta Hindus when they perceive Nirguna Brahman (Brahman without qualities). Other Buddhists experience Amida Buddha, while Christians experience Christ, and Jews Yahweh, Hindus Isvara, and Muslims Allah. All such religious mystics really perceive a personal, supreme God, understood panentheistically, as being “in” the cosmos, akin to how a soul is in a body. Yet other religious mystics perceive the Cosmos, that is, the totality of finite things (the “body” of the World Soul).

These three – Creativity, God, Cosmos – are such that none could exist without the others. Further, it is really Creativity that is ultimate, and it is “embodied in” and does not exist apart from or as an object in addition to God and the Cosmos. Sometimes God and Cosmos are described as aspects of Creativity. The underlying metaphysics here is that of process philosophy, in which events are the basic or fundamental units of reality. On such a metaphysics, any apparent substance (being, entity) turns out to be one or more events or processes. Even God, the greatest concrete, actual being in this philosophy, is, in the end, an all-encompassing “unity of experience,” and is to be understood as a process of “Creative-Responsive love.” (Griffin 2005b; Cooper 2006, ch. 7)

All the major religions, then, are really oriented towards, and involve the experience of, some reality regarded as “ultimate” (Creativity, God, or Cosmos). It is also allowed that each major religion really does deliver the cure it claims to (for example, salvation and heaven, Nirvana, Moksha), and is entitled to operate by its own moral and epistemic values. Further, this theory respects and does not try to eliminate all these differences, and so makes genuine dialogue between members of the religions possible. Finally, Cobb and Griffin emphasize that this approach does not endorse any unreasonable form of relativism and, as such, allows one to remain distinctively Christian or Buddhist and so forth. They hope that each religion can, while remaining distinct, begin to construct “global” theologies, influenced by the truths and values of other religions. In all these ways, they argue that their ultimist pluralism is superior to other pluralisms.

This view has not been widely accepted because the Process theology and philosophy on which it is based has not been widely accepted.

One may object that the above proposal runs counter to the equalizing spirit of pluralism. Griffin and Cobb seem to attribute the deepest insight to those who think the ultimate reality is an impersonal, indescribable non-thing. In their view, those who confess experience of Emptiness, Nirguna Brahman, or the One (of Neoplatonism) behold the ultimate reality (Creativity) as it really is, in contrast to monotheists or cosmos-focused religionists, who latch on to what are limited aspects of Creativity. But these monotheists and cosmos-worshipers each take their object to be ultimate, and would deny the existence of any further back entity or non-entity, that is, of Creativity. It would seem that, for example, for a Christian to accept this ultimist pluralism, she will have to reinterpret what many Christians will regard as a core commitment, namely, that the ultimate reality is personal. Even a Mahayana Buddhist may have a lot of adjusting to do, if she is to admit that believers in a personal God really do experience the greatest entity, and something which is not separate from Emptiness. Wouldn’t this be to attribute more reality to God than she’s willing to? And how can the ultimist pluralist demand such changes?

A similar pluralism is advanced by Japanese Zen scholar Masao Abe (1915-2006). He applies the Mahayana doctrine of the “three bodies” of the Buddha to other religions. In Mahayana Buddhism, the ultimate reality, a formless but active non-thing, is Emptiness, or the Truth Body (Dharmakaya). This in some sense manifests as, acts as, and is not different from a host of Enjoyment Bodies (Sambhogakaya), each of which is a Buddha outside of space and time, a historical Buddha now escaped from samsara and dwelling in a Buddha-realm. The historical Buddha, the man Gautama, is, in this doctrine, a Transformation Body (or Apparitional Body, Nirmanakaya) of one of these, as are other Buddhas in time and space. In some sense these three are one; however, the Truth Body manifests or acts as various Enjoyment Bodies, which in turn manifest or act as various Transformation Bodies. The latter two classes of beings, but not the first, may be described as “personal.”

As to religious diversity, Abe suggests that we view the dynamic activity Emptiness (also called “Openness”) as ultimate, and as manifesting as various “Gods,” that is, various monotheistic deities, and “Lords,” which are human religious teachers, whether manifestations of a god, as in the case of Jesus, or just pre-eminent servants of a god, as with Moses or Muhammad.

It is a mistake, Abe holds, to regard any god as ultimate, and monotheists must revise their understanding as above, if true inter-religious dialogue and peace are to be achievable. Equally, Advaita Vedanta Hindus must let go of their insistence on Nirguna Brahman as ultimate. It is a mistake to think that the ultimate is any substantial, self-identical thing, even an ineffable one. (Abe 1985)

Must the Mahayanist make any significant revision to accept the proposed “threefold reality” of Emptiness, gods, and lords? Presumably not, as she already believes in levels of truth and levels of reality. At the highest level there is only Emptiness, the ultimate. The gods and lords will stay at the “provisional” levels of truth and reality, the levels which a fully enlightened person, as it were, sees beyond.

Abe’s views have been criticized by other scholars as misunderstanding some other religions’ claims, and as privileging Mahayana Buddhist doctrines, insofar as he understands these doctrines as being truer than others. It can be argued that Abe is an inclusivist, maintaining that Buddhism is the best religion, rather than a true pluralist. (Burton 2010)

e. Identist Pluralisms

In much of the religious studies, theology, and philosophy of religion literature of the 1980s through the 2000s, the term “religious pluralism” means the theory of philosopher John Hick. (Hick 2004; Quinn and Meeker 2000) Hick’s approach is original, thorough, and informed by a broad and deep knowledge of many religions. His theory is at least superficially clear, and is rooted in his own spiritual journey. It attracted widespread discussion and criticism, and Hick has engaged in a spirited debate with all comers. It is here described as “identist” pluralism because his theory claims that people in all the major religions interact with one and the same transcendent reality, variously called “God,” “the Real,” and “the Ultimate Reality.”

Hick viewed religious belief as rationally defensible, and held that one may be rational in trusting the veridicality of one’s religious experiences. However, he thought that it is arbitrary and indefensible to hold that only one’s own experiences or the experiences of those in one’s group are veridical, while those of people in other religions are not. Subjectively, those other people have similar grounds for belief. These ideas, and the fact that religious belief is strongly correlated with birthplace, convinced him that the facts of religious diversity pose irresolvable problems for any exclusivist or inclusivist view, leaving only some form of pluralism as a viable option. (Hick 1997)

Starting as a traditional, non-pluralistic Christian, Hick attended religious meetings and studied with people of other religions. As a result, he became convinced that basically the same thing was going on with the followers of these other religions as with Christians, namely, people responding in culturally conditioned but transformative ways to one and the same Real or Ultimate Reality. In his earlier writings, monotheistic concerns seem important. How could a perfect being fail to be available to all people in all the religions? Later on, Hick firmly settled on the view that this Real should be thought of as ineffable. Appropriating Immanuel Kant’s distinction between phenomena, how things appear, and noumena, things in themselves, Hick postulated that the Real is ineffable and is not directly experienced by anyone. However, he maintained that people in the religions interact with it indirectly, by way of various personae and impersonae, personal and impersonal appearances or phenomena of the Real. In other words, this Ultimate Reality, due to the various qualities of human minds, appears to various people as personae, such as God, the Trinity, Allah, Vishnu, and also as impersonae such as Emptiness, Nirvana, Nirguna Brahman, and the Dao. These objects of religious experience are mind-dependent, in that they depend for their existence, in part, on people with certain religious backgrounds. By contrast, the Real in itself, that is, the Ultimate Reality as it intrinsically is, is never experienced by anyone, but is only hypothesized as overall the most reasonable explanation of the facts of religious diversity.

Among these purported facts, for Hick, is the claim that the great religions equally well facilitate the ethical transformation of their adherents, what Hick calls a transformation from self-centeredness to other-centeredness and Reality-centeredness. (Sometimes, however, Hick makes the weaker claim that we’re unable to pick any religion as best at effecting this transformation.) This transformation, Hick theorizes, is really the point of religion. All religions, then, are equal in that they are responses to the ineffable Ultimate Reality which equally well—or for all we can tell equally well—bring about an ethical improvement in humans, away from self-centeredness and towards other humans and the Ultimate Reality.

Hick realizes the incoherence of dubbing all religions “true,” for they have core teachings that conflict, and most religions are not shy about pointing out such conflicts. We could loosely paraphrase Hick’s view as being that all religions are false, not that all their teachings are false (for there is much ethical and practical agreement among them), but rather that their core claims about the main object of religious experience and pursuit equally contain falsehoods. Monotheists, after all, take the ultimate being to be a personal god, while others, variously called ultimists, absolutists, or monists, hold the ultimate to be impersonal, such as the Dao, Emptiness, Nirguna Brahman, and so forth. These, Hick holds, are all mistaken; the Ultimate Reality is neither personal nor impersonal. (Hick 1995, ch. 3) To say that it is either, Hick realizes, would be to hand an epistemic victory to either the monotheists or the absolutists (ultimists). This, he will not do.

Instead, Hick downgrades the importance of true belief to religion. Though not true, doctrines such as the Trinity or the Incarnation, he argues, may be interpreted to have “mythological truth,” that is, a tendency to influence people towards getting what Hick postulates is the cure offered by the religions, the ethical transformation described above.

Hick doesn’t argue for the salvific or cure-delivering equality of all religions. Rather, he only argues for the equality of what he calls “post-axial” religions – major religious traditions which have arisen since around 800 B.C.E. (Hick 2004, ch. 2-4; Meeker 2003)

Hick’s identist religious pluralism has been objected to as thoroughly as any recent theory in philosophy. Here we can survey only a few of the criticisms that have been made. (For others see Hick 2004, Introduction.)

Many have objected that Hick’s pluralism is not merely a theory about religions, but is itself a competitor in the field, offering a diagnosis and cure which disagrees with those of the world religions. It is hard to see, then, how his theory enables one to be, as Hick claimed to be, a “pluralistic Christian,” given that one has replaced the diagnosis and cure of Christianity with those of Hick’s pluralism. In reply, Hick urges that his claims are not themselves religious, but are rather about religious matters, and are, as such, philosophical.

Hick’s claim that no human concept applies to the Ultimate Reality has been criticized by many, who’ve pointed out that Hick applies these concepts to it: being one, being ultimate, being a partial cause of the impersonae and personae, and being ineffable. Moreover, it seems a necessary truth that if the concept personal doesn’t apply to the Real, then the concept non-personal must apply to it. (King, 2008; Rowe 1999; Yandell 1999) In response, Hick concedes that some concepts, “formal” ones, can be applied to the Real, while “substantial” ones cannot. He switches to the term “transcategorial,” points out historical versions of this thesis, and urges that the Real simply is not in the domain of entities to which concepts like personal and non-personal apply. His critics, he argues, are merely asserting without reason that there cannot be a transcategorial reality. (Hick 2000, 2004)

As to Hick’s idea that the correlation of birthplace and religious belief somehow undermines the rationality of religious belief, it has been pointed out that religious pluralism too is correlated with birthplace. In response to his claims that non-pluralistic religious believers are being arrogant, irrational, or arbitrary in believing that one religion (theirs) is the most true one, it has been pointed out that Hick too, as a religious pluralist, holds views which are inconsistent with many or most religions, seemingly preferring his own insights or experiences to theirs, which would, by Hick’s own lights, be just as arrogant, irrational, or arbitrary. (O’Connor 1999; King 2008; Bogardus 2013)

Others object that given the transcategoriality or ineffability of the Real, even with the above qualifications, there is no reason to think that interaction with the Real should be ethically beneficial, or that it should have any connection at all to any religious value. (Netland 2001)

Others object that Hick’s pluralism requires arbitrarily reinterpreting religious language non-literally, and usually as having to do with morality, contrary to what most proponents of those religions believe. (Yandell 2013)

Again, it has been objected that Hick, contrary to many religions, downgrades religious practice and belief as inessential to a religion, the only important features of a religion being that it is a response to the Ultimate Reality and that it fosters the ethical transformation noted above. Further, Hick presupposes the correctness of recent socially “liberal” ethics, for example, “sexual liberation,” and thus rules out as inessential to any religion any conflicting ethical demands. (Legenhausen 2006)

Other objections have been centered on the status of Hick’s personae. If, for example, in his view Allah, Vishnu, and Yahweh are all real and distinct, is Hick thereby committed to polytheism? Or are those gods mere fictions? (Hasker 2011) At first Hick evades the issue of polytheism by describing his theory not as a kind of “polytheism,” but rather as “poly-something.” He then suggests that two views of the personae are compatible with his theory: that they are mental projections, or that they are real, but angel- or deva-like beings, intermediaries and not really gods. Finally, Hick revises his view: the monotheistic gods people experience are mental projections in response to the Real, and not real selves, but since religious people really do encounter great selves in religious experience, we should posit personal intermediaries between humans and the Real, with whom religious people interact. Perhaps these are the angels, devas, and heavenly Buddhas of the religions, great but nonetheless finite beings. Thus Christians, for example, imagine that in prayer they interact with the ultimate, a monotheistic god, but in fact they interact with angels, and perhaps different Christians with different angels. It is a mistake, he now holds, to suppose that the personae (that is, Vishnu, Yahweh, Allah, and so on) are angel-like selves, for this is not compatible with his thesis that Vishnu and the others are phenomena of the Real, that is, culturally conditioned ways that the Real appears to us. (Hick 2011)

A less developed identist pluralism is explored by Peter Byrne. (1995, 2004) All the major religions are equal in that they (1) refer to and facilitate cognitive contact with a single, transcendent reality, (2) offer a similarly moral- and eternal-oriented “cure,” and (3) include revisable and incomplete accounts of this transcendent reality. It has been objected that this theory is not promising because it is hard to see how we could ever have sufficient evidence for some of its claims, while others are implausible in light of the evidence we do have. (Mawson 2005)

f. Expedient Means Pluralism

Historically, Buddhist thought about other religions has almost never been pluralistic. (Burton 2010; Kiblinger 2005; and section 4c below) But in modern times, some have constructed a novel and distinctively Buddhist pluralism using the Mahayana doctrine of “expedient means” (Sanskrit: upaya). The classic discussion of this is in the Lotus Sutra (before 255 C.E.), which argues that previous versions of Buddhist teaching were mere expedient means, that is, non-truths taught because, in his great wisdom, the Buddha knew that, at its then immature stage, humanity would be aided only by those teachings. This was a polemic against non-Mahayana versions of Buddhist dharma (teaching). Now that the time is right, the truth, that is, Mahayana doctrine, may be told, superseding the old teachings. However, more recently, it has been argued that all religious doctrines, even Mahayana ones, are expedient means, helpful non-truths, ladders to be kicked away upon attainment of the cure, here understood as a non-cognitive awareness of the ultimate reality. (Burton 2010)

3. Exclusivism

a. Terminological Problems

The term “exclusivist” was originally a polemical term, chosen in part for its negative connotations. Some have urged that it be replaced by the more neutral terms “particularism” or “restrictivism.” (Netland 2001, 46; Kärkkäinen 2003, 80-1) This article retains the common term because it is widespread and many have adopted the label for their own theory of religious diversity.

In this article “exclusivism” about religious diversity denies any form of pluralism; it denies that all religions, or all “major ones,” are the same in some important respect. Insofar as a religion claims to possess a diagnosis of the fundamental problem facing humans and a cure, that is, a way to permanently and positively resolve this problem, it will then assume that other, incompatible diagnoses and cures are incorrect. Because of this, arguably exclusivism (or inclusivism, see section 4 below) is a default view in religious traditions. Thus, for example, the earliest Buddhist and Christian sources prominently feature staunch criticisms of various rival teachings and practices as, respectively, false and useless or harmful. (Netland 2001; Burton 2010)

Some philosophers, going against the much-discussed identist pluralism of John Hick (see 2e above), use “exclusivism” to mean reasonable and informed religious belief which is not pluralist. (O’Connor 1999) This “exclusivism” is compatible with both exclusivism and inclusivism as defined in this article. It is difficult to make a fully clear distinction between exclusivist and inclusivist approaches. The basic idea is that the inclusivist grants more of the values in question to religions other than the single best religion – more truth, more salvific efficacy, more veridical experience of the objects of religious experience, more genuine moral transformation, and so forth.

Finally, because of their fit with many traditional religious beliefs and commitments, sometimes exclusivism and inclusivism are considered as two varieties of “confessionalism,” views on which “one religion is…true and…we must view other religions in the light of that fact.” (Byrne 2004, 203)

b. Naive Exclusivisms

An exclusivist stance is often signaled by the claim that there is “only one true religion.” Other religions, then, are “false.” A naive person may infer from this that no claim, or no central claim, of any other religion is true, but that all such claims are false. This position cannot be self-consistently maintained. Consider the claim that the cosmos was intentionally made. An informed Christian must concede that Jews and Muslims too believe this, and that they teach it as a central doctrine. Thus, if central Christian teachings are true, then so is at least one central teaching of these two rival religions.

Another naive exclusivist view which is rejected by most theorists is that all who are not full-fledged members of the best religion fail to get the cure. For example, all non-Christians go to hell, or all non-Buddhists fail to gain Nirvana, or to make progress towards it. Theorists nearly always loosen the requirement with regard to what they view as the one most true and/or most effective religion. Thus, Christian exclusivists usually allow that those who die as babies, the severely mentally handicapped, or friends of God who lived before Christian times may avoid hell and attain heaven despite their not being fully-fledged, believing and practicing Christians. (Dupuis 2001; Meeker 2003; section 3c below) Similarly, Buddhists usually allow that a person may gain positive karma, and so a better rebirth, by the practice of various other religions, helping her to advance, life by life, towards getting the cure by means of the distinctive Buddhist beliefs and practices.

While such naive exclusivist positions are rarely expounded by scholars, they frequently appear in the work of pluralists and inclusivists, held up as unfortunate, harmful, and unreasonable theories which are in urgent need of replacement.

c. Christian Exclusivisms

Early bishop Ignatius of Antioch (c. 35-107) writes that “if any follow a schismatic [that is, the founder of a religious group outside of the bishop-ruled catholic mainstream] they will not inherit the Kingdom of God.” (Letter of Ignatius to the Philadelphians 3:3) Leading catholic theologian Origen of Alexandria (c. 186-255) wrote: “outside the Church no one is saved.” (Dupuis 2001, 86-7) Yet Origen also held, at least tentatively, that eventually all rational beings will be saved.

Thus, the slogan that there is no salvation outside the church (Latin: Extra ecclesiam nulla salus) was meant to communicate at bare minimum the uniqueness of the Christian church as God’s instrument of salvation since the resurrection of Jesus. The slogan was nearly always, in the first three Christian centuries, wielded in the context of disputes with “heretical” Christian groups, the point being that one can’t be saved through membership in such groups. (Dupuis 2001, 86-9)

However, what about Jews, pagans, unbaptized babies, or people who never have a chance to hear the Christian message? After catholic Christianity became the official religion of the empire (c. 381), it was usually assumed that the message had been preached throughout the world, leaving all adult non-Christians without excuse. Thus, Augustine of Hippo (354-430) and Fulgentius of Ruspe (468-533) interpreted the slogan as implying that all non-Christians are damned, because they bear the guilt of “original sin” stemming from the sin of Adam, which has not been, as it were, washed away by baptism. (Dupuis 2000, 91-2)

Water baptism, from the beginning, had been the initiation rite into Christianity, but it was still unclear what church membership strictly required. Some theorized, for instance, that a “baptism of blood,” that is, martyrdom, would be enough to save unbaptized catechumens. Later theologians added a “baptism of desire,” which was either a desire to be baptized or the inclination to form such a desire, either way enough to secure saving membership in the church. In the first case, a person who is killed in an accident on her way to be baptized would nonetheless be in the church. In the second, even a virtuous pagan might be a church member. This “baptism of desire” was officially affirmed by the Roman Catholic Council of Trent in 1547.

With the split of the catholic movement into Roman Catholic and Eastern Orthodox branches, “the church” was understood in Western contexts to be specifically the Roman Catholic church. Thus, famously, in a papal bull of 1302, called by its first words Unam Sanctam (that is, “One Holy”), Pope Boniface VIII (r. 1294-1303) declared that outside the Roman Catholic church, “there is neither salvation nor remission of sins,” and “it is altogether necessary to salvation for every human creature to be subject to” the pope. (Plantinga 1999, 124-5; Neuner and Dupuis 2001, 305) Note that this might still be interpreted with or without the various non-standard ways to obtain church membership mentioned above. The context of this statement was not a discussion of the fate of non-Christians, but rather a political struggle between the pope and the king of France.

In the Decree for the Copts of the General Council of Florence (1442), a papal bull issued by Pope Eugene IV (r. 1431-47), the slogan was asserted for the first time in an official Roman Catholic doctrinal document not only with respect to heretics and schismatics, but also concerning Jews and pagans. (Neuner and Dupuis 2001, 309-10) It also seems to close the door to non-standard routes to church membership, saying that “never was anyone, conceived by a man and a woman, liberated from the devil’s dominion except by faith in our Lord Jesus Christ.” (Tanner 1990, 575) Non-Catholics will “go into the everlasting fire…unless they are joined to the catholic church before the end of their lives…nobody can be saved, no matter how much he has given away in alms and even if he has shed his blood in the name of Christ, unless he has persevered in the bosom and the unity of the catholic church.” (Tanner 1990, 578)

This exclusivistic or “rigorist” way of understanding the slogan, on which only the Roman Catholic church could provide the “cure” needed by all humans, was the most common Catholic stance on religious diversity until the mid-nineteenth century. But some had always held on to theories about ways into the church other than water baptism, and since the European discovery of the New World it had become clear that the gospel had not been preached to the whole world; many held that such pagans were non-culpably ignorant of the gospel. This view was affirmed by Pope Pius IX (r. 1846-78) in his Singulari Quadam (1854): “outside the Apostolic Roman Church no one can be saved…On the other hand…those who live in ignorance of the true religion, if such ignorance be invincible, are not subject to any guilt in this matter before the eyes of the Lord.” (Neuner and Dupuis 2001, 311)

Nineteenth-century popes condemned Enlightenment-inspired theories of religious pluralism about truth and salvation, then called “indifferentism,” it being, allegedly, indifferent which major religion one chose, since all were of equal value. At the same time, they argued that many people who are outside the one church cannot be blamed for this, and so will not be condemned by God.

Such views are consistent with exclusivism in the sense that Roman Catholic Christianity is the one divinely provided and so most effective instrument of salvation, as well as the most true religion, and the “true religion” in the sense that any claim which contradicts its official teaching is false. Letters by Pius XII (r. 1939-58) declared that “by an unconscious desire and longing” non-Catholics may enjoy a saving relationship with the church. (Dupuis 2001, 127-9) Whether these non-Catholics are thought to be in the church by a non-standard means, or whether they are said to be not in the church “in reality” but only “in desire,” it was held that they were saved by God’s grace. (Neuner and Dupuis 2001, 329)

Since the Vatican II council (1962-5), many Catholic theologians have embraced what most philosophers will consider some form of inclusivism rather than a suitably qualified exclusivism, with a minority opting for some sort of pluralism. (On the majority inclusivism, see section 4b below.) The impetus for this change came from statements of that council (their Latin titles: Lumen Gentium, Ad Gentes, Nostra Aetate, Gaudium et Spes), which are in various ways positive towards non-Catholics. One asserts not merely the possibility, but the actuality of salvation for those who are inculpably ignorant of the gospel but who seek God and try to follow his will as expressed through their own conscience. Another, without saying that people may be saved through membership in them, affirms various positive values in other religions, including true teachings, which serve as divinely ordained preparations for the reception of the gospel. Catholics are exhorted to patient, friendly dialogue with members of other religions. (Dupuis 2000, 161-5) Some Catholic theologians have seen the seeds or even the basic elements of inclusivism in these statements, while others view them as within the orbit of a suitably articulated exclusivism. (Dupuis 2000, 165-170) A key area of disagreement is whether or not these imply that a person may be saved by means of their participation in some other religion. Still other Catholic theologians have found these moves to be positive but not nearly different enough from the more pessimistic sort of exclusivism. Such theologians, prominently Hans Küng (b. 1928) and Paul Knitter (b. 1939), have formulated various pluralist theories. (Kärkkäinen 2003, 197-204, 309-17)

Protestant versions of exclusivism can be at least as strict as Augustine’s. Recently called “restrictivism,” this position insists that explicit knowledge of the gospel of Jesus Christ is necessary for salvation, and there is no hope for those who die without having heard the gospel. (McDermott and Netland 2014, 148) But these positions are sometimes tempered with loopholes such as a universal chance to hear the gospel at or after death, a free pass to people who die before the “age of accountability,” or the view that less was required to be saved in pre-Christian times. Another view, taken by Bible-oriented evangelical Protestants, allows the possibility of non-Christians receiving saving grace, but is firmly agnostic as to whether this actually occurs, and if it does, how often, because of the paucity of relevant biblical statements. (McDermott and Netland 2014) Other Protestants choose forms of inclusivism similar to Rahner’s (see 4b below).

4. Inclusivism

a. Terminological Problems

On the one hand, it is difficult to consistently distinguish inclusivism from exclusivism, because the latter nearly always concedes some significant value to other religions. “Inclusivism” for some authors just means a friendlier or more open-minded exclusivism. On the other hand, many theorists want to adopt the friendly and broad-minded sounding label “pluralism” for their theory, even though they clearly hold that one religion is uniquely valuable. For example, both Christians and Buddhists have adopted religious-diversity-celebrating rhetoric while clearly denying anything described above in this article as a kind of “pluralism” about religious diversity. (Dupuis 2001; Burton 2010)

b. Abrahamic Inclusivisms

Historically, Jewish intellectuals have usually adopted an inclusivist rather than an exclusivist view about other religions. A typical Rabbinic view is that although non-Jews may be reconciled to God, and thus gain life in the world to come, by keeping a lesser covenant which God has made with them, still Jews enjoy a better covenant with God. Beginning in the late twentieth century, however, some Jewish thinkers have argued for pluralism along the lines of various Christian authors, revising traditional Jewish theology. (Cohn-Sherbok 2005)

Since the late twentieth century, many Roman Catholic theologians have explored non-exclusivist options. As explained above (section 3c), a major impetus for this has been statements issued by the latest official council (Vatican II, 1962-5). One goes so far as to say that “the Holy Spirit offers to all [humans] the possibility of being associated, in a way known to God, with the Paschal Mystery [that is, the saving death and resurrection of Jesus].” (Gaudium et Spes 22, quoted in Dupuis 2001, 162) Some Catholic theologians see in these documents the groundwork or beginnings of an inclusivist theory, on which other religions have saving value.

Influential German theologian Karl Rahner (1904-84), in his essay “Christianity and the Non-Christian Religions,” argues that before people encounter Christianity, other religions may be the divinely appointed means of their salvation. Insofar as they in good conscience practice what is good in their religion, people in other religions receive God’s grace and are “anonymous Christians,” people who are being saved through Christ, though they do not realize it. All Christians believe that some were saved before Christianity, through Judaism. So too at least some other religions must still be means for salvation, though not necessarily to the same degree, for God wills the salvation of all humankind. But these lesser ways should and eventually will give way to Christianity, the truest religion, intended for all humankind. (Plantinga 1999, 288-303)

Subsequent papal statements have moved cautiously in Rahner’s direction, affirming the work of the Holy Spirit not only in the people in other religions, but also in those religions themselves, so that in the practice of what is good in those religions, people may respond to God’s grace and be saved, unbeknownst to them, by Christ. Nonetheless, the Roman Catholic church remains the unique divine instrument; no one is saved without some positive relation to it. (Dupuis 2001, 170-9; Neuner and Dupuis 2001, 350-1)

Although many traditional Protestant Christians hold some form of exclusivism, others favor an inclusivism much like Rahner’s. (Peterson et al. 2013, 333-40) Theologically liberal Protestants most often hold on to some form of religious pluralism.

As a relative latecomer which has always acknowledged the legitimacy of previous prophets, including Abraham, Moses, and Jesus, while proclaiming its prophet to be the greatest and last, Islam has, like Judaism, tended towards inclusivist views of other religions. The traditional Islamic perspective is that while in one sense “Islam” was initiated by Muhammad (570-632 C.E.), “Islam” in the sense of submission to God was taught by all prior prophets, and so their followers were truly Muslims, that is, truly submitted to God. Still, given that Muhammad is the seal of the prophets, his teachings and practices should, and some day will, supersede all previous ones. Recent Islamic thinkers have independently come to conclusions parallel to those of Rahner, while critiquing various pluralist theories as entailing the sin of unbelief (kufr), the rejection of Islam.

It is a matter of dispute whether certain famous Sufi Muslims such as Rumi (1207-73) and Ibn ‘Arabi (1165-1240) have held to some form of religious pluralism. (Legenhausen 2006, 2009)

c. Buddhist Inclusivisms

While there have been Buddhist teachers and movements who have been exclusivists, in general Buddhism has been inclusivist. Buddhism has long been very doctrinally diverse, and many schools of Buddhism argue that theirs is the truest teaching or the best practice, while other versions of the dharma are less true or less conducive to getting the cure, and have now been superseded. It has been typical also for Buddhist thinkers to hold that at best, the same is true of other religious traditions. (Burton 2010) On the other hand, some religions’ teachings are simply false and their practices are unhelpful; the contents of their prescribed beliefs and practices matter.

Some Buddhist texts teach that there can be a solitary Buddha (pratyekabuddha), a person who has gained enlightenment by his own efforts, independently of Buddhist teaching. Such a person is outside of the tradition, yet obtains the cure taught by the tradition. This is an inclusivist view about getting the cure, and about central religious truths. There are even cases of “Buddhists seeking to turn devotees of other religions traditions into ‘anonymous Buddhists’ who worship Buddhist deities without realizing that this is the case.” (Burton 2010, 11)

d. Plural Salvations Inclusivism

Forming his views by way of a detailed critique of various core and identist pluralist theories, Baptist theologian S. Mark Heim (b. 1950) proposes what he calls “a true religious pluralism,” which is nonetheless best understood as a version of inclusivism, as it allows its proponent to maintain the superiority of her own religion. (Heim 1995, 7)

Heim notes that pluralists like Hick insist on one true goal or “salvation” which is achieved by all the equally valuable religions, a goal which is proposed by the pluralist and which differs from those proposed by most of those religions. Heim suggests that we should instead assume that other religions both pursue and achieve real and distinct religious “salvations” (goals or ends). For instance, as an inclusivist Christian, Heim holds that Buddhists really do attain Nirvana. But doesn’t Christian tradition demand that each person eventually either achieve fellowship or union with God or be irrevocably damned? Heim suggests that those who attain Nirvana would be, from a Christian perspective, either a subgroup of the saved or of the damned, depending on just what, metaphysically, is actually going on with such people. (Heim 1995, 163) This is consistent with the Christian belief that the end pursued by Christians is in fact better than all others; thus, heaven is better than Nirvana. However, God has ordained Nirvana as a goal suitable for some non-Christians to both pursue and attain. In this and in a later book Heim asserts that such a plurality of ordained religious goals is implied by the doctrine of the Trinity.

It is far from clear that Heim is correct that this stance will be consistent with the claims of the “home” religion. Importantly, he construes the various religious goals as “experiences” obtaining in this life and continuing beyond. (Heim 1995, 154-5, 161) This is an important qualifier, as various religious goals clearly presuppose contrary claims. For example, in Theravada Buddhism, one must realize that there is no self, whereas in Advaita Vedanta Hinduism one must gain awareness that one’s true self is none other than the ultimate reality, Brahman. Similarly, in Christianity, one must realize that one’s self is a sinner in need of God’s grace. It is impossible that all three experiences are veridical. But Heim’s theory does not require that they be veridical, only that they occur and may plausibly be thought of as fulfilling by those who have them.

Heim strenuously objects to pluralist theories on the ground that they impose uniformity on the various religions. However, his theory seems to depend crucially on the existence of many human problems, each of which may be solved by participation in some religion or other. In contrast, each of the various religions claims to have discerned the one fundamental problem facing humans, namely, the problem from which other problems derive. In the terms explained above, a religion claims to have a diagnosis (section 1 above). This seems incompatible with Heim’s agnosticism about which, if any, of the diagnosed problems is the fundamental one. If a religion cures only a shallow, derivative human problem, leaving the deeper problem intact, then what it offers would not deserve the name “salvation,” for it would leave those who achieve it still in need of the cure. (Peterson et al. 2013, 333) For instance, if Theravada Buddhism is correct that humans are trapped in the cycle of rebirth by craving and ignorance, even if one goes to a heavenly realm upon death, such as those envisaged by non-Buddhist religions, one is still trapped in samsara, in this realm of suffering, albeit at a higher tier. How can a Theravada Buddhist accept that such a heavenly next life is a good and final end for non-Buddhists? Again, if a Christian diagnosis is correct, that humans are alienated from and need to be reconciled to God, yet some manage to attain Nirvana, they would still lack the cure, for it is no part of Nirvana that one is reconciled to God.

5. References and Further Reading

  • Abe, Masao. “A Dynamic Unity in Religious Pluralism: A Proposal from the Buddhist Point of View.” The Experience of Religious Diversity. Ed. John Hick and Hasan Askari. Brookfield, Connecticut: Gower Publishing Company, 1985. 167-227. Partially reprinted in Readings in Philosophy of Religion: East Meets West. Ed. Andrew Eshleman. Malden, Massachusetts: Blackwell, 2008. 395-404.
    • Presents an ultimist pluralism modeled on the Mahayana Buddhist “three bodies of the Buddha” doctrine.
  • Bogardus, Tomas. “The Problem of Contingency for Religious Belief,” Faith and Philosophy 30.4 (2013): 371-92.
    • Rebuts sophisticated arguments by Hick, Kitcher, and others, that you cannot know that some religious claim is true because had you been born in another place or time, you would not have believed that claim.
  • Burton, David. “A Buddhist Perspective.” The Oxford Handbook of Religious Diversity. New York: Oxford University Press, 2010. 321-36.
    • Surveys Buddhist views on religious diversity.
  • Byrne, Peter. Prolegomena to Religious Pluralism: Reference and Realism in Religion. New York: St. Martin’s Press, 1995.
    • Explores in depth without endorsing an identist religious pluralism.
  • Byrne, Peter. “It is not Reasonable to Believe that Only One Religion is True.” Contemporary Debates in Philosophy of Religion. Ed. Michael Peterson and Raymond VanArragon. Malden, MA: Blackwell, 2004. 201-10.
    • Argues for an identist pluralism and against “confessionalism” (either inclusivism or exclusivism).
  • Cohn-Sherbok, Dan. “Judaism and Other Faiths.” The Myth of Religious Superiority: A Multifaith Exploration. Ed. Paul F. Knitter. Maryknoll, New York: Orbis Books, 2005. 119-32.
    • An overview of traditional, inclusivist Jewish views of other religions, then arguing that Jewish theology should be revised to accommodate an identist pluralism.
  • Cooper, John W. Panentheism: The Other God of the Philosophers: From Plato to the Present. Grand Rapids, Michigan: Baker Academic, 2006.
    • Chapter 7 is an accessible introduction to the metaphysics and theology of Whitehead and Hartshorne, without which the ultimist pluralism of Cobb and Griffin can’t be understood.
  • Datta, Narendra [Swami Vivekananda]. “Hinduism.” The Penguin Swami Vivekananda Reader. Ed. Makarand Paranjape. New Delhi: Penguin Books India, 2005 [1893]. 43-55.
    • One of several hit speeches given at the first World Parliament of Religions in Chicago in 1893, asserting that all religions share one object and goal, although Hinduism is more tolerant, peaceful, and flexible than other traditions.
  • de Cea, Abraham Vélez. “A Cross-cultural and Buddhist-Friendly Interpretation of the Typology Exclusivism-Inclusivism-Pluralism.” Sophia 50 (2011): 453-80.
    • An attempt to clarify and expand the common trichotomy, adding a fourth category, “pluralistic inclusivism.”
  • Dupuis, Jacques. Toward a Christian Theology of Religious Pluralism. Maryknoll, New York: Orbis Books, 2001 [1997].
    • Survey of the long evolution of Roman Catholic thought on religious diversity, arguing for an inclusivist theory.
  • Feuerbach, Ludwig. Lectures on the Essence of Religion. Translated by Ralph Manheim. New York: Harper and Row, 1967 [1851]
    • A naturalistic, humanistic, atheistic critique of belief in God as a product of human desire and imagination; a form of negative pluralism.
  • Griffin, David Ray. “Religious Pluralism: Generic, Identist, Deep.” Deep Religious Pluralism. Ed. David Ray Griffin. Louisville, Kentucky: Westminster John Knox Press, 2005a. 3-38.
    • Surveys sophisticated recent pluralist theories by Hick, Smith, Knitter, Cobb, and criticisms of these. Argues for the superiority of his own ultimist (“deep”) religious pluralism.
  • Griffin, David Ray. “John Cobb’s Whiteheadian Complementary Pluralism.” Deep Religious Pluralism. Ed. David Ray Griffin. Louisville, Kentucky: Westminster John Knox Press, 2005b. 39-66.
    • Presentation of ultimist pluralism as developed by Cobb and Griffin.
  • Hasker, William. “The Many Gods of Hick and Mavrodes,” Evidence and Religious Belief. Ed. Kelly James Clark and Raymond J. VanArragon. New York: Oxford University Press, 2011. 186-98.
    • Critical discussion of Hick’s views on the personae and impersonae of religious experience, which are supposed to be manifestations of the Ultimate Reality.
  • Heim, S. Mark. Salvations: Truth and Difference in Religion. Maryknoll, New York: Orbis, 1995.
    • Critiques various pluralistic theories as insufficiently respectful of the real differences between religions and proposes a plural salvations inclusivism.
  • Hick, John. God Has Many Names. Philadelphia: The Westminster Press, 1982 [1980].
    • A short book written at a crucial juncture in Hick’s thinking about religious diversity; probably the best place to start in understanding Hick’s views.
  • Hick, John. The Rainbow of Faiths [U.S. title: A Christian Theology of Religions: The Rainbow of Faiths]. London: SCM Press, 1995.
    • A short and popular exposition of, and development of his mature views as expounded in his 1989 An Interpretation; mostly written in the form of imagined dialogues.
  • Hick, John. “The Epistemological Challenge of Religious Pluralism.” Faith and Philosophy 14.3 (1997): 277-86. Reprinted in Hick 2010, 25-36.
    • Argues that religious exclusivism and inclusivism face devastating epistemological problems; see Hick 2010 for his exchanges with some leading Christian philosophers about this piece.
  • Hick, John. “Ineffability,” Religious Studies 36.1 (2000): 35-46. Reprinted in Hick 2010, ch. 3.
    • Replies to criticisms of the ineffability or transcategoriality of “the Real” by Rowe and Insole.
  • Hick, John. An Interpretation of Religion: Human Responses to the Transcendent, 2nd ed. New Haven, Connecticut: Yale University Press, 2004 [1989].
    • The main exposition of what is widely considered the best-developed pluralist theory (esp. ch. 14-16), espousing the practical equality of “post-axial” religions (ch. 2-4). Its long introduction summarizes his replies to many critics.
  • Hick, John, ed. Dialogues in the Philosophy of Religion. New York: Palgrave-MacMillan, 2010 [2001].
    • Reprints and continues Hick’s exchanges in the late 1990s with a number of prominent philosophers and theologians.
  • Hick, John. “Response to Hasker.” Evidence and Religious Belief. Ed. Kelly James Clark and Raymond J. VanArragon. New York: Oxford University Press, 2011. 199-201.
    • Hick clarifies his claims regarding the personae and impersonae by means of which people interact with the Ultimate Reality.
  • Ignatius of Antioch, “The Letter of Ignatius to the Philadelphians,” in The Apostolic Fathers: Greek Texts and English Translations, 3rd ed. Ed. Michael W. Holmes. Grand Rapids, Michigan: Baker Academic, 2007. 236-47.
    • Early Christian writing, probably from the first half of the second century, in which the bishop says that followers of schismatic leaders will not be saved.
  • Kärkkäinen, Veli-Matti. An Introduction to the Theology of Religions. Biblical, Historical, and Contemporary Perspectives. Downers Grove, Illinois: InterVarsity Press, 2003.
    • Wide-ranging discussion of Christian responses to religious diversity from biblical times up till the present, valuable for its summaries of ancient, early modern, and recent theological sources.
  • King, Nathan. “Religious Diversity and its Challenges to Religious Belief.” Philosophy Compass 3/4 (2008): 830-53.
    • Lucid survey of varieties of exclusivism, inclusivism, the identist pluralism of John Hick, and epistemological difficulties arising from disagreements about religious matters.
  • Kiblinger, Kristin. Buddhist Inclusivism: Attitudes Towards Religious Others. Burlington, Vermont: Ashgate, 2005.
    • Sympathetic description and criticism of Buddhist inclusivism by a non-Buddhist scholar.
  • Legenhausen, Hajj Muhammad [Gary Carl] “Why I am not a Traditionalist.” 2002.
    • Online overview of traditionalist core pluralism and critique from the perspective of Shia Islam.
  • Legenhausen, Hajj Muhammad [Gary Carl]. “A Muslim’s Proposal: Non-Reductive Religious Pluralism.” 2006.
    • Insightful online article classifying theories of religious pluralism and arguing for what the author calls “non-reductive pluralism” (here described as an example of Abrahamic Inclusivism, 4b above) by a philosopher who is an American convert to Shia Islam.
  • Legenhausen, Hajj Muhammad [Gary Carl]. “On the Plurality of Religious Pluralisms.” International Journal of Hekmat 1 (2009): 6-42.
    • The most comprehensive classification of varieties of theories of religious pluralism.
  • Long, Jeffery D. “Anekanta Vedanta: Towards a Deep Hindu Religious Pluralism.” Deep Religious Pluralism. Ed. David Ray Griffin. Louisville, Kentucky: Westminster John Knox Press, 2005. 130-57.
    • Exploration of ultimist (“deep”) religious pluralism by a scholar who is an American convert to Hinduism; argues that whether or not “Hinduism” is pluralistic or inclusivist depends on whether it is understood as Vedic tradition, Indian tradition, or Sanatana Dharma [eternal religion].
  • Long, Jeffery D. “Universalism in Hinduism.” Religion Compass 5/6 (2011): 214-23.
    • Survey of historical and recent pluralist theories in Hinduism (here called versions of “universalism”) and criticisms thereof.
  • Mawson, T.J. “‘Byrne’s’ religious pluralism.” International Journal for Philosophy of Religion 58.1 (2005): 37-54.
    • Negative critique of the identist pluralism of Peter Byrne.
  • McDermott, Gerald and Netland, Harold. A Trinitarian Theology of Religions: An Evangelical Proposal. New York: Oxford University Press, 2014.
    • A recent evangelical Protestant version of exclusivism, embedded in a Christian theology of religions.
  • Meeker, Kevin. “Exclusivism, Pluralism, and Anarchy.” God Matters: Readings in the Philosophy of Religion. Ed. Raymond Martin and Christopher Bernard. New York: Pearson, 2003. 524-35.
    • Shows how versions of exclusivism, inclusivism, and pluralism can be viewed on a continuum, as excluding more or fewer religions; contrasts Hick’s “altruistic pluralism” with a more open-minded but less plausible “anarchic pluralism.”
  • Morales, Frank [Sri Dharma Pravartaka Acharya]. Radical Universalism: Does Hinduism Teach That All Religions Are the Same? New Delhi: Voice of India, 2008.
    • American-born Hindu teacher and scholar argues against the pluralism of Datta [Vivekananda] and Chattopadhyay [Ramakrishna] that it is: incoherent, inconsistent with the facts of religious diversity, foreign to Hinduism, relativistic, intolerant, destructive of Hinduism, and based on misinterpretations of Hindu scriptures.
  • Netland, Harold. Encountering Religious Pluralism: The Challenge to Christian Faith and Mission. Downers Grove, Illinois: InterVarsity Press, 2001.
    • A defense of Christian “particularism” (compatible with the descriptions “exclusivism” or “inclusivism” in this article) by an evangelical theologian, with summaries of earlier missionary literature and criticisms of pluralist theories.
  • Neuner, Josef and Dupuis, Jacques, eds. The Christian Faith in the Doctrinal Documents of the Catholic Church, Seventh Revised and Enlarged Edition. Bangalore: St. Peter’s Seminary, 2001.
    • Collection of Roman Catholic primary sources, including documents relating to the uniqueness of Catholicism, the idea that there is no salvation outside the church, and views on non-Catholic Christianity and non-Christian religions.
  • O’Connor, Timothy. “Religious Pluralism.” Reason for the Hope Within. Ed. Michael Murray. Grand Rapids, Michigan: Eerdmans, 1999. 165-81
    • Defends “exclusivism” (rejection of any pluralism) against arguments that it is arbitrary, arrogant, or irrational, and argues that Hickian identist pluralism is incoherent.
  • Peterson, Michael, et al. Reason and Religious Belief, 5th ed. New York: Oxford University Press, 2013.
    • Leading philosophy of religion textbook with excellent chapter (14) on pluralism, exclusivism, and inclusivism.
  • Plantinga, Cornelius, ed. Christianity and Plurality: Classic and Contemporary Readings. Malden, Massachusetts: Blackwell, 1999.
    • Collection of Christian documents concerning religious diversity, starting with the Bible and ending with a statement by Pope John Paul II.
  • Prothero, Stephen. God is Not One. The Eight Rival Religions that Run the World – and Why Their Differences Matter. New York: HarperOne, 2010.
    • Introductory overview of eight religious traditions which aims to undermine “the new orthodoxy” of naive or core pluralism.
  • Quinn, Philip L. and Kevin Meeker, eds. The Philosophical Challenge of Religious Diversity. New York: Oxford University Press, 2000.
    • Important anthology of philosophical pieces, largely consisting of attacks on and defenses of Hick’s identist pluralism.
  • Rowe, William. “Religious Pluralism.” Religious Studies 35.2 (1999): 139-50.
    • Argues that Hick’s central claim that the Ultimate Reality is ineffable is incoherent.
  • Schmidt-Leukel, Perry. “Exclusivism, Inclusivism, Pluralism: The Tripolar Typology – Clarified and Reaffirmed.” The Myth of Religious Superiority: A Multifaith Exploration. Ed. Paul F. Knitter. Maryknoll, New York: Orbis Books, 2005, 13-27.
    • Catalogues and responds to the many objections various authors have given to the standard trichotomy, and precisely defines it in terms of giving knowledge sufficient to give people the cure which religion offers.
  • Sedgwick, Mark. Against the Modern World: Traditionalism and the Secret Intellectual History of the Twentieth Century. New York: Oxford University Press, 2004.
    • Intellectual history of “traditionalism” or “perennialism” (core pluralism).
  • Sharma, Arvind. “All religions are – equal? one? true? same?: a critical examination of some formulations of the Neo-Hindu position.” Philosophy East and West 29.1 (January 1979): 59-72.
    • An attempt to clarify the sort of pluralism popular in Hinduism since Datta.
  • Sharma, Arvind. A Hindu Perspective on the Philosophy of Religion. London: MacMillan, 1990.
    • Chapter 9 is a basic introduction to the pluralistic orientation of modern Hinduism, interacting with a few western scholars (W.A. Christian, W.C. Smith, and Hick).
  • Sharma, Arvind. The Concept of Universal Religion in Modern Hindu Thought. New York: St. Martin’s Press, 1999.
    • Surveys the views of leading 19th and 20th century Hindu intellectuals on the theme of “universal religion,” which can be an (alleged) fact or an unrealized ideal; some versions of religious pluralism are discussed.
  • Smart, Ninian. Dimensions of the Sacred: An Anatomy of the World’s Beliefs. Berkeley, California: University of California Press, 1996.
    • Presents a seven-fold analysis of the different aspects of religious traditions.
  • Smith, Huston. Forgotten Truth: The Common Vision of the World’s Religions. San Francisco: HarperSanFrancisco, 1992 [1976].
    • Thorough presentation of a core pluralism, as a part of what he calls “perennial philosophy,” or “traditionalism.”
  • Smith, Huston. Beyond the Postmodern Mind: The Place of Meaning in a Global Civilization, Updated and Revised. Wheaton, Illinois: Quest Books, 2003 [1982].
    • Further exposition of Smith’s “perennial philosophy,” put in the context of his diagnoses of the historical mistakes of “Modern” and “Postmodern” thinking, and his practical suggestions for the future.
  • Smith, Huston. “No Wasted Journey: A Theological Autobiography.” The Huston Smith Reader. Ed. Huston Smith and Jeffrey Paine. Berkeley: University of California Press, 2012, 3-12.
    • Smith explains his journey from Christian to religious naturalist to eclectic seeker to perennialist core pluralist.
  • Stenmark, Mikael. “Religious Pluralism and the Some-Are-Equally-Right View.” European Journal for Philosophy of Religion 2 (2009): 21-35.
    • Articulates the position named in the title as an alternative to the exclusivism-inclusivism-pluralism trichotomy, in part motivated by what he calls “the problem of emptiness” for Hick’s pluralism – roughly, that his (nearly) inconceivable Real is irrelevant to any religious concerns.
  • Tanner, Norman, ed. Decrees of the Ecumenical Councils, Volume One: Nicea I to Lateran V. Washington, D.C.: Georgetown University Press, 1990.
    • First of two volumes with the official original language texts and English translations of all twenty-one official councils recognized by the Roman Catholic church; the “Bull of union with the Copts” from the council of Florence (1442) expresses an exclusivist stance.
  • Yandell, Keith. Philosophy of Religion: A Contemporary Introduction. New York: Routledge, 1999.
    • Focuses on the differences between monotheistic religions, Advaita Vedanta Hinduism, Jainism, and Theravada Buddhism, with a thorough critique of Hick’s pluralism. (ch. 6)
  • Yandell, Keith. “Has Normative Religious Pluralism a Rationale?” Can Only One Religion Be True? Paul Knitter and Harold Netland in Dialogue Ed. Robert B. Stewart. Minneapolis: Fortress Press, 2013, 163-79.
    • Spirited attack on pluralist theories as poorly motivated and inconsistent with traditional religious beliefs.

 

Author Information

Dale Tuggy
Email: filosofer@gmail.com
State University of New York at Fredonia
U. S. A.

Adolf Lindenbaum


Adolf Lindenbaum was a Polish mathematician and logician who worked in topology, set theory, metalogic, general metamathematics and the foundations of mathematics. He represented an attitude typical of the Polish Mathematical School, consisting in the use of all admissible methods, regardless of whether they were finitary. For example, the axiom of choice was freely applied, while proofs omitting this axiom were also welcomed. In set theory, Lindenbaum and Tarski posed the important conjecture that the generalized continuum hypothesis entails the axiom of choice. Among the most important metalogical and metamathematical contributions made by Lindenbaum are the following: the theorem that every system of propositional calculus has an at most denumerably infinite normal matrix; the construction of the so–called Lindenbaum algebra; and the maximalization theorem.

Lindenbaum studied mathematics under Wacław Sierpiński, Stefan Mazurkiewicz and Kazimierz Kuratowski in Warsaw. A member of the Lvov–Warsaw School, the influential group of Polish analytic philosophers, Lindenbaum also belonged to the Polish mathematical school and the Warsaw school of logic. He began his career as a topologist, and his doctoral dissertation, written under Sierpiński, was devoted to properties of point–sets. Then, in the mid–1920s, he switched to logic and joined the Warsaw School of Logic, which had been established by Jan Łukasiewicz and Stanisław Leśniewski after World War I. Lindenbaum was a close friend and collaborator of Alfred Tarski.

Table of Contents

  1. Curriculum Vitae
  2. A General Outline of Lindenbaum’s Scientific Career and His Views
  3. Lindenbaum and Set Theory
  4. Lindenbaum and Logical Calculi
  5. Lindenbaum and General Metamathematics
  6. Final Remarks
  7. References and Further Readings

1. Curriculum Vitae

Adolf Lindenbaum was born into an assimilated (Polonized), wealthy Jewish family in Warsaw on June 12, 1904 (see Mostowski–Marczewski 1971, Surma 1982, Zygmunt–Purdy 2014 for biographical data on Lindenbaum). He received his secondary education at M. Kreczmar’s Gymnasium in Warsaw (1915–1922) and next entered Warsaw University to study mathematics (1922–1926) under such teachers as Kazimierz Kuratowski, Stefan Mazurkiewicz and Wacław Sierpiński in mathematics, as well as Stanisław Leśniewski and Jan Łukasiewicz in logic. Lindenbaum also attended courses by Alfred Tarski on cardinal numbers and elementary mathematics, Tadeusz Kotarbiński’s course in logic, and some classes in the humanities, including the history of philosophy (Władysław Tatarkiewicz; a general course and a special class on Kant), aesthetics (also Tatarkiewicz, on French art), psychology (Władysław Witwicki), linguistics (Karol Appel), literature (Józef Ujejski on Adam Mickiewicz, the most important Polish national poet), and the history and culture of Palestine (Moses Schorr).

Sierpiński supervised Lindenbaum’s PhD dissertation, entitled O własnościach mnogości punktowych (On Properties of Point–Sets). The thesis was defended in 1928 and Lindenbaum received the title of Doctor of Philosophy. In 1934 he presented his Habilitation thesis, based on several published papers, to the Faculty of Mathematics and Natural Sciences at Warsaw University and obtained the degree of Docent (a person entitled to lecture). This resulted in his appointment as adjunct professor at the Philosophical Seminar at the same faculty. The Philosophical Seminar was an independent unit at the Faculty of Mathematics and Natural Sciences, directed for years by Łukasiewicz; in practice, it functioned as a logical seminar. Lindenbaum lectured on various mathematical and logical topics from 1935 to 1939. His courses concerned, for example, the following topics: “On New Investigations into the Foundations of Mathematics and the Mathematical Foundations of other Disciplines”, “On Superposition of Functions” and “Selected Topics from Metrology and from the Theory of Functions”. He stood little chance of being promoted to an academic position higher than docent because of the anti–Semitic policy in Polish universities after 1935, his involvement in the communist movement, and the shortage of university positions at the time. He was also a tutor at the Scientific Circle of Jewish Students at Warsaw University, established after the exclusion (in the 1890s) of Jews from the general students’ associations existing in Poland.

Lindenbaum was a typical good–looking bon vivant. He married a Jewish beauty, Janina Hosiasson, also a logician, who worked successfully on induction and confirmation. As Janina Hosiasson informed Alfred Tarski in one of her letters in early 1941 (this letter is in Tarski’s Archive in U.C. Berkeley’s Bancroft Library), she and Adolf were separated at the beginning of World War II. Lindenbaum was a declared leftist. He belonged to the Polish Communist Party (KPP) until its dissolution by Stalin in 1938; he was also an activist in intelligentsia circles. In 1936 Lindenbaum signed a petition demanding that Carl von Ossietzky, a journalist imprisoned by the Nazis, be awarded the Nobel Peace Prize, and he protested against the massacre of workers in Lvov in 1936. Mrs. Janina Kotarbińska, the wife of Tadeusz Kotarbiński and a close friend of the Lindenbaums, reported the following story:

It happened that I and Antoni Pański [also a philosopher – J. W] visited the Lindenbaums in their apartment. I noticed on Dolek’s [the diminutive of Adolf – J. W.] desk the Short Philosophical Dictionary [the dictionary written by Pavel Yudin and Mark Rozental and published in Russian in the Soviet Union at the beginning of the 1930s. The authors represented the Stalinist version of Marxism. The Short Philosophical Dictionary became a symbol of the orthodox Marxist ideology – J. W.]. The book was opened at the entry ‘Dialectical Contradiction’. I was surprised, and on our way back I asked Pański why Dolek, such a clever person, read such stupidities about the concept of contradiction. Pański answered that Dolek believed in every word of this book.

She added, however, that Lindenbaum had considerable interest in philosophy, without any kind of dogmatism. He was ready for an open discussion of any philosophical issue. Lindenbaum was strongly interested in literature and art, and was famous as a passionate climber.

After the outbreak of World War II, Lindenbaum immediately realized that for his own security he should escape before the German army took Warsaw. His Jewish origin was probably not the only reason behind this decision. More importantly, it was obvious that the Germans possessed lists of Polish communists and other critics of Hitler’s regime. Lindenbaum was particularly afraid of the consequences of the support he had given to Ossietzky, as such actions had been strongly, even furiously, criticized in Germany. Although Lindenbaum could have emigrated to the West or obtained a position in Moscow as a scientist, he decided to remain in Poland, because he had hopes for the socialist future of the country. The Lindenbaums left Warsaw on September 6, 1939 and went to Vilna (presently Vilnius); this city, formerly in Poland, became the capital of Lithuania after September 17, 1939. Lithuania was an independent country in 1918–1939 with Kaunas as its capital; in 1939–1941 Lithuania was formally independent, but entirely dependent on the Soviet Union. It was occupied by Germany after its attack on the Soviet Union on June 22, 1941. Janina remained there, but Adolf moved to Bialystok, a city occupied by the Soviet Army after its invasion of Poland on September 17; he probably expected this part of Poland to become the nucleus of the future Polish communist state. Lindenbaum was appointed as a docent in the Bialystok Pedagogical Institute established by the Soviet authorities, and he taught mathematics there. The German–Soviet war started on June 22, 1941, and the Germans soon came to Bialystok. The reasons why Lindenbaum did not leave the city are unknown. In September 1941, he was arrested by the Gestapo, transported to Vilnius, and killed in Ponary (the site of many massacres, particularly of Jews, in 1941–1944) near the Lithuanian capital. Janina Hosiasson–Lindenbaum was murdered in Vilnius in 1942. The exact dates of the deaths of the Lindenbaums remain unknown.

2. A General Outline of Lindenbaum’s Scientific Career and His Views

Lindenbaum began his scientific career as a topologist, but he soon converted to logic and the foundations of mathematics. He and Tarski (three years his senior) became friends in the early 1920s, and the latter influenced Lindenbaum in the direction of mathematical logic and the foundations of mathematics. Both shared not only scientific interests but also a negative attitude toward any version of religion, various leftist political ideas, and a love of mountains and literature; both were also very sensitive to their situation as secular Jews who sought to be assimilated into and accepted by Polish society. Neither of them, however, was fully successful in this last respect. At the beginning of his scientific career, Lindenbaum was very active in the Student Mathematical Scientific Circle as well as in the Student Philosophical Circle, and he successfully promoted logic among his colleagues in both groups. In particular, he delivered several lectures on logical problems at the meetings of both circles. He was a mentor in logic to students in Warsaw. For instance, he wrote a section on logic in the Mathematical–Physical Study: Information Book for Newcomers, published in 1926, in which he reported how logic was taught in Warsaw. It is a very interesting document, which shows how strong the position of logic in Warsaw was in the mid–1920s; the fact that Principia Mathematica was recommended as a textbook for advanced students is particularly striking. Lindenbaum’s brilliant personality, charming style of life and powerful mathematical skills fascinated the Warsaw scientific community. Not surprisingly, he was commonly considered one of the most gifted Polish mathematicians of his generation. Tarski (1949, p. XII) described Lindenbaum as “a man of unusual intelligence”. Mostowski once called Lindenbaum the most lucid mind in the foundations of mathematics. Legendary stories told by Lindenbaum’s friends and colleagues document many cases of theorems discovered by him but proved by someone else, as he had no time to complete his ideas. Yet the list of his scientific contributions is quite long; it comprises more than 40 papers, abstracts and reviews (see Surma 1982 for Lindenbaum’s bibliography), mostly published in German and French. Some of Lindenbaum’s papers were co–authored, in particular with Tarski and Andrzej Mostowski. Lindenbaum and Tarski worked on the book Theorie der eindeutigen Abbildungen, announced as volume 8 of the series Monografie Matematyczne (“Mathematical Monographs”) and planned for publication in 1938. This information is given on the back cover of S. Saks, Theory of the Integral, Monografie Matematyczne, Warszawa–Lwów 1937 (co–published by G. E. Stechert, New York). Saks’ book appeared as volume 7 of the series, which suggests that the book by Lindenbaum and Tarski was close to completion. Lindenbaum was also well regarded internationally. Leading logicians and mathematicians, including Wilhelm Ackermann, Friedrich Bachmann, Abraham Fraenkel, Andrei Kolmogorov and Arnold Schmidt, reviewed his writings. Lindenbaum actively participated in the Polish Mathematical Congresses and in the International Congress of Scientific Philosophy held in Paris in 1935.

Lindenbaum belonged to three scientific schools, namely the Polish Mathematical School (with Sierpiński, Mazurkiewicz and Kuratowski as its leaders in Warsaw), the Warsaw School of Logic (Łukasiewicz, Leśniewski and Tarski, the last of whom joined the leading group of the School in the 1920s; see Woleński 1995 for a general presentation of logic in Poland in the interwar period) and the Lvov–Warsaw School (Kazimierz Twardowski and his disciples from Lvov, in particular Łukasiewicz, Leśniewski and Kotarbiński). The second affiliation is perhaps the most important. The Warsaw School of Logic was a “child” of mathematicians and philosophers. Łukasiewicz and Leśniewski, the main figures in this group, were philosophers by training. Nevertheless, they became professors at the Faculty of Mathematics and Natural Sciences at Warsaw University and were active in the mathematical environment. Zygmunt Janiszewski, another founding father of the Polish Mathematical School, who died prematurely before Lindenbaum entered the university, developed the so–called Janiszewski program, a very ambitious plan for the development of mathematics in Poland that attributed crucial significance to logic and the foundations of mathematics. The Fundamenta Mathematicae, a journal established by Janiszewski and serving from its inception as an official scientific journal of the Polish Mathematical School, published many papers by logicians. It is noteworthy that Mazurkiewicz, Sierpiński, Leśniewski and Łukasiewicz (two professional mathematicians and two logicians originating from philosophy) formed the Editorial Board of the Fundamenta. It was important for the subsequent vigorous development of logic in Warsaw that the mathematical milieu accepted philosophers as professional teachers of students of mathematics.

This double heritage, philosophical as well as mathematical, determined the scientific ideology of Warsaw logicians. Perhaps one point should be mentioned here as particularly important in this context: the Polish Mathematical School did not assume any specific philosophical standpoint concerning the nature of mathematics. Methodologically speaking, all fruitful mathematical methods, particularly those coming from set theory, could be used in logical investigations, provided that they did not lead to contradictions. The last statement should be understood in the following way. Clearly, proofs of consistency are important and required; however, even before Gödel’s second incompleteness theorem (roughly speaking, that the consistency of arithmetic cannot be proved in arithmetic itself) was announced, Polish mathematicians maintained that if a theory of given concepts is empirically known to be contradiction–free, it can be safely used in mathematical investigations, including logical research. As a consequence, the Polish Mathematical School did not subscribe to logicism, formalism or intuitionism, even though these were widely regarded as the main foundational currents in the philosophy of mathematics in 1900–1930. On the other hand, Polish logicians worked on many problems suggested by Russell, Hilbert and Brouwer, the main exponents of the mentioned schools. Moreover, there was sometimes a tension between the private philosophical views of some Polish logicians and their research practice. For instance, Tarski, influenced by Leśniewski’s and Kotarbiński’s nominalism, expressed explicit sympathy with this view, yet he did not hesitate to use higher set theory and inaccessible cardinals in the foundations of mathematics.

There is practically nothing known about Lindenbaum’s philosophical views concerning mathematics. Clearly, his inclination toward dialectical materialism had no influence on his philosophy of mathematics and foundational views. In fact, he shared the general attitude of the Polish Mathematical School mentioned above. Lindenbaum published some papers directly related to general foundational problems (for example, Lindenbaum 1930, Lindenbaum 1931) in which he recommended the use of mathematics in logical investigations without any hesitation about employing infinitary methods; he pointed out that such methods are present even in elementary arithmetic. He also published reviews of works by Polish radical nominalists, such as Leon Chwistek and Władysław Hetper, but entirely abstained from philosophical comments. Two papers, Lindenbaum 1936 and Lindenbaum–Tarski 1934–1935, deserve particular mention. The former, Lindenbaum’s contribution to the already mentioned Paris Congress of 1935, concerns the formal simplicity of concepts. Although Lindenbaum points out that this question arises in many fields, he does not offer any general definition of simplicity. He distinguishes seven relevant problems concerning simplicity: (a) of systems of concepts (terms); (b) of propositions and their systems; (c) of inference rules; (d) of proofs; (e) of definitions and constructions; (f) of deductive theories; (g) of formal languages; but he restricts his further considerations to (a). The idea is that counting the letters occurring in a given concept can tell us something about the simplicity of the term (Lindenbaum follows Leśniewski in this respect). Two points are interesting. First, Lindenbaum assumes the simple theory of types, which suggests that he preferred a more elementary construction whenever one is possible and adequate for a given problem; this attitude was also very popular among Warsaw mathematicians and logicians. Second, Lindenbaum observes, obviously under Tarski’s influence, that simplicity has not only a syntactic dimension (in Carnap’s sense) but should also be considered semantically.

The paper Lindenbaum–Tarski 1934–1935 basically concerns some metamathematical problems about the limitations of means of expressions (expressive power, to use present terminology) in deductive theories. The authors claim that all relations between objects (individuals, classes, relations, and so forth) expressible by purely logical means remain invariant under an arbitrary one–one mapping of the “world” (that is, the collection of all individuals) onto itself. Moreover, this invariance is logically provable. This idea was more fully developed by Tarski in his paper on logical concepts (see Tarski 1986). In a sense, the understanding of logical concepts as invariant under all one–one mappings has affinities with the Erlangen Program (Tarski stressed this point) in the foundations of geometry formulated by Felix Klein. Lindenbaum and Tarski proved a general theorem justifying the intuitive explanation of understanding logical relations as invariant under one–one mappings. The consequences of this approach to logical concepts for philosophy of logic are far–reaching. In particular, the Lindenbaum–Tarski definition of logical concepts motivates the theorem that logic does not distinguish any extralogical concept (roughly speaking, what can be proved in logic about an extralogical item, for instance, an individual, can be proved about any other individual). Consequently, logical theorems are true in all possible worlds (models). Thus, the definition in question naturally leads to seeing logic as invariant with respect to any specific content.
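
The invariance idea can be stated schematically; the following display is an editorial reconstruction in modern notation, not Lindenbaum and Tarski’s own formulation.

    % Invariance of logically expressible relations (schematic reconstruction).
    \[
    R \ \text{is expressible by purely logical means}
    \;\Longrightarrow\;
    \pi(R) = R \ \text{ for every one--one mapping } \pi \text{ of the world } D \text{ onto itself},
    \]
    % where \pi(R) denotes the image of R under \pi, applied at every type level
    % (to individuals, classes, relations, and so forth).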

Lindenbaum’s works related to logic and the foundations of mathematics concern set theory and logical calculi, including their metalogical properties. This article deals with general set theory in section 3, while sections 4 and 5 are devoted to logical matters. (Section 3 skips special topics, including those belonging to other mathematical fields; the borderline between general and special set theory is somewhat arbitrary; moreover, some of Lindenbaum’s results will be mentioned without entering into formal details.)

Lindenbaum’s results in set theory were achieved in collaboration with other authors, particularly Tarski and Mostowski. In the 1920s and 1930s Łukasiewicz conducted a seminar in mathematical logic. Its participants were a group of young logicians including, in addition to Lindenbaum and Tarski, Stanisław Jaśkowski, Andrzej Mostowski, Jerzy Słupecki, Bolesław Sobociński and Mordechaj Wajsberg. This seminar soon became a factory of new results in mathematical logic. Its participants worked collaboratively on problems. Lindenbaum’s results about logical calculi, as Łukasiewicz explicitly says, were achieved at this seminar. Lindenbaum frequently stated theorems, usually without proofs. His most important results were mentioned by others; some of them are to be found in Łukasiewicz–Tarski 1930 (referred to below as Ł–T1930).

3. Lindenbaum and Set Theory

In 1926, Lindenbaum and Tarski published the joint paper “Communication sur les recherches de la théorie des ensembles” (Lindenbaum–Tarski 1926; the abbreviation L–T1926 is used in further references). This paper is very compact. In 30 pages the authors announced many results in set theory and its applications achieved by them within the “last few years”. More particularly, the theorems and definitions concerned cardinal and ordinal numbers, the relations between them, and the theory of one–one mappings. The results were stated without proofs. The authors noted that proofs and further developments would appear in subsequent writings. (Perhaps the already mentioned monograph Theorie der eindeutigen Abbildungen was intended as a continuation of L–T1926; several related results are contained in Tarski 1949, and Lindenbaum is mentioned in the Preface to this book as a person particularly effective in conducting research on cardinal numbers.) The spirit of the Polish Mathematical School is evident in this paper. The purely mathematical text is interrupted by historical and methodological comments; Polish mathematicians considered (and still consider) such remarks a very important feature of mathematical prose. Due to the role of the axiom of choice (AC) in set theory and its controversial nature, results obtained without the use of this axiom are grouped in a section separate from the sections containing the theorems based on AC. The paper lists 102 theorems or lemmas about cardinal numbers, 5 about properties of one–one mappings, 16 about order types, 4 about the arithmetic of ordinal numbers and 19 about point–sets. In many cases, the investigations by Lindenbaum and Tarski continue earlier works and achievements by Cantor, Dedekind, Bernstein, Fraenkel, Hartogs, Korselt, König, Lebesgue, von Neumann, Russell and Whitehead, Schröder, Zermelo, Banach, Kuratowski, Leśniewski and Sierpiński. In a sense, L–T1926 can serve as a very important historical report on the state of the art in set theory and its foundational problems in the mid–1920s.

Perhaps the most important result (theorem 94) announced in L–T1926 concerns the generalized continuum hypothesis (GCH) and AC. Lindenbaum posed the problem of how AC (one of the main focuses of the Polish Mathematical School) is related to Cantor’s hypothesis on alephs (the name used for GCH in L–T1926). Theorem 94 states that GCH entails AC. This result was proved by Sierpiński in 1947 (see Sierpiński 1965, pp. 43–44, and Moore 1982, pp. 215–217 for a brief survey). The search for equivalents of AC became one of the Polish mathematical specialties de la maison. Lindenbaum (L–T1926, theorem 82(L)) claimed that AC is equivalent to the assertion that for arbitrary cardinal numbers m and n, m ≤* n or n ≤* m (the symbol ≤* expresses the relation between cardinal numbers m and n such that either m = 0 or every set of power n is the sum of m mutually disjoint non–empty sets). This theorem was finally proved by Sierpiński in 1949 (see Sierpiński 1965, pp. 435–436). L–T1926 also presents material on the Cantor–Bernstein theorem (CBT; Lindenbaum and Tarski used the label “the Schröder–Bernstein theorem”), which says that for any cardinal numbers m, n, if m ≤ n and n ≤ m, then m = n (see Hinkis 2013 for a very detailed historical exposition). In particular, Lindenbaum proposed some equivalents of CBT, also in terms of order–types and one–one mappings. One of these equivalents is the following proposition: for any order–types α, β, γ, δ, if α = β + γ and γ = α + δ, then α = γ (in terms of one–one mappings: if an ordered set A is similar to a segment of an ordered set B, and B is similar to a residue of A, then the sets A and B are similar). Another interesting result, related to the Bernstein Division Theorem (BDT), says that for any natural number k and any cardinal numbers m, n, if km = kn, then m = n. L–T1926 claims that Lindenbaum proved BDT in its full generality and did so without AC. Yet there is a slight historical controversy concerning the scope of Bernstein’s original proof (see Hinkis 2013, p. 139). Leaving this question aside, there is no doubt that L–T1926 played an important role in the development of the foundations of set theory.
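
For convenience, the propositions just discussed can be written out in modern notation (an editorial reconstruction, not the notation of L–T1926); in the last formula the natural number k is taken to be positive.

    % GCH entails AC (theorem 94 of L-T1926; proved by Sierpinski in 1947):
    \[
    \forall \text{ infinite } \mathfrak{m}\ \neg\exists\, \mathfrak{n}\ (\mathfrak{m} < \mathfrak{n} < 2^{\mathfrak{m}})
    \;\Longrightarrow\; \mathrm{AC}.
    \]
    % Lindenbaum's equivalent of AC (theorem 82(L)):
    \[
    \mathrm{AC} \;\Longleftrightarrow\; \forall\, \mathfrak{m}, \mathfrak{n}\
    (\mathfrak{m} \le^{*} \mathfrak{n} \ \lor\ \mathfrak{n} \le^{*} \mathfrak{m}).
    \]
    % The Cantor--Bernstein theorem (CBT):
    \[
    \mathfrak{m} \le \mathfrak{n} \ \wedge\ \mathfrak{n} \le \mathfrak{m} \;\Longrightarrow\; \mathfrak{m} = \mathfrak{n}.
    \]
    % The Bernstein division theorem (BDT), for k \ge 1:
    \[
    k \cdot \mathfrak{m} = k \cdot \mathfrak{n} \;\Longrightarrow\; \mathfrak{m} = \mathfrak{n}.
    \]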

Finally, consider the paper by Lindenbaum and Mostowski (Lindenbaum–Mostowski 1938) on the independence of AC from the other axioms of Zermelo–Fraenkel set theory (ZF). Abraham Fraenkel claimed that he had proved the independence from ZF of AC (in its standard version, postulating the existence of a choice set for any family of non–empty and disjoint sets) and of two related principles (every set can be ordered; if for every family of finite and mutually disjoint sets a choice set exists, then there exists a choice set for every countable family of sets with mutually disjoint sets as elements). Lindenbaum and Mostowski remark that Fraenkel’s investigations, though of great value, cannot be regarded as fully successful “because a dangerous confusion of metamathematical and mathematical notions is inherent in it”. More specifically, Fraenkel’s notions of model and function are obscure. Lindenbaum and Mostowski propose to understand the concept of function in a semantic manner, that is, as determined by the formula “a set satisfies a propositional function”. Moreover, they point out an error in Fraenkel’s proof consisting in his treatment of permutations. Lindenbaum and Mostowski propose a modification of Fraenkel’s construction in order to correct the proof. This is done by improving the axiomatization with respect to the axioms of separation, replacement and infinity. This step also allows one to prove that other equivalents of AC are not derivable in ZF. The authors observe that ZF remains (relatively) consistent if we add the axiom “there exists an infinite set having non–sets as elements”. Moreover, the results cannot be obtained in a system in which sets are admitted as the only elements. These results were later completed by Mostowski. The models of the modified ZF with urelements (items that are not sets) are called the Fraenkel–Mostowski models. Perhaps a particularly interesting feature of Lindenbaum–Mostowski 1938 is its conscious use of semantic methods in the metamathematics of set theory.

4. Lindenbaum and Logical Calculi

This section reviews Lindenbaum’s results in the metalogic of the propositional calculus. They were announced at Łukasiewicz’s seminar in mathematical logic in 1926–1930. Łukasiewicz initiated a research program devoted to propositional (sentential) logic, classical as well as many–valued (the results are collected in Ł–T1930). Two main directions of investigation were pursued by Łukasiewicz’s group. First, constructions of sentential logic as axiomatic systems were undertaken; Łukasiewicz and his students looked for independent and possibly economical axioms (economical in the sense of having the smallest number of axioms consisting of the fewest symbols). Second, Warsaw logicians (mostly Tarski) invented and systematized the basic metalogical tools for the investigation of propositional calculi. Lindenbaum, contrary to the majority of the members of Łukasiewicz’s seminar, had no interest in proposing new axiomatizations of logic; his activities belonged entirely to metalogic. More specifically, Lindenbaum contributed to the matrix method, that is, to investigating sentential logic via logical matrices, which are generalizations of the well–known truth–tables.

Abstractly speaking (the level of abstraction is higher in Ł–T1930), a logical matrix is an ordered quadruple M = [U, U′, f, g] such that U and U′ are two arbitrary sets (in order to exclude trivial cases, both are assumed to be non–empty), U has at least two elements, U′ ⊆ U, f is a binary function, and g is a unary function. Both functions are defined on U and take values in U. The intended interpretation is as follows: U is the set of logical values, U′ the set of designated logical values. In the case of two–valued logic (1 for truth, 0 for falsehood), U = {1, 0} and U′ = {1}. Assume that L is a language of sentential logic. A valuation v assigns values from U to the formulas of L in accordance with f and g: if A, B ∈ L, then v(¬A) = g(v(A)) and v(A ⇒ B) = f(v(A), v(B)); if language is not taken into account, M is simply an algebra of (logical or other) values. Write v(A) for “the logical value of A in M under v”. The matrix M is normal provided that if x ∈ U′ and y ∉ U′, then f(x, y) ∉ U′; roughly speaking, an “implication” with a designated antecedent and an undesignated consequent is itself undesignated, which guarantees that detachment preserves designated values. If A ∈ L and v(A) ∈ U′ for every valuation v, then A is a tautology in M (A is verified by M). We can consider g as the function corresponding to negation and f as the counterpart of implication. Thus, in the two–valued case, v(¬A) = 1 if v(A) = 0 and v(¬A) = 0 if v(A) = 1, while v(A ⇒ B) = 1 if v(A) = 0 or v(B) = 1, and otherwise v(A ⇒ B) = 0. These equalities show that the truth–tables for implication and negation are special cases of logical matrices in the abstract sense.
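
A small computational sketch may help to fix these definitions. The code below is an illustration prepared for this article (the class name, the formula encoding and the function names are editorial choices, not anything found in Ł–T1930): it represents a matrix by its value set, its designated values and the functions f and g, and checks whether a formula built from ¬ and ⇒ is verified by the two–valued matrix.

    from itertools import product

    # A logical matrix in the sense described above: values U, designated values U',
    # a binary function f (read: "implication") and a unary function g (read: "negation").
    class Matrix:
        def __init__(self, values, designated, f, g):
            self.values, self.designated, self.f, self.g = values, designated, f, g

        def value(self, formula, valuation):
            """Value of a formula under a valuation of its variables.
            Formulas are a variable name, ('neg', A), or ('imp', A, B)."""
            if isinstance(formula, str):
                return valuation[formula]
            if formula[0] == 'neg':
                return self.g(self.value(formula[1], valuation))
            if formula[0] == 'imp':
                return self.f(self.value(formula[1], valuation),
                              self.value(formula[2], valuation))
            raise ValueError(formula)

        def verifies(self, formula, variables):
            """A is a tautology in M iff every valuation gives A a designated value."""
            return all(self.value(formula, dict(zip(variables, vals))) in self.designated
                       for vals in product(self.values, repeat=len(variables)))

    # The classical two-valued matrix as a special case: U = {0, 1}, U' = {1}.
    two_valued = Matrix({0, 1}, {1},
                        f=lambda x, y: 0 if (x == 1 and y == 0) else 1,
                        g=lambda x: 1 - x)

    peirce = ('imp', ('imp', ('imp', 'p', 'q'), 'p'), 'p')   # ((p => q) => p) => p
    print(two_valued.verifies(peirce, ['p', 'q']))           # prints True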

Lindenbaum established several important theorems connecting propositional calculi with logical matrices (they are listed here in a different order than they appear in Ł–T1930; that paper contains no proofs). Let the symbol LOGℵ0 refer to the many–valued logic with a denumerably infinite set of logical values. Łukasiewicz defined matrices for such a system. Lindenbaum (theorem 16 in Ł–T1930) established that LOGℵ0 can be characterized by any matrix in which U′ = {1}, the functions f and g satisfy the conditions f(x, y) = min(1, 1 − x + y) and g(x) = 1 − x, and U is an arbitrary infinite set of numbers which satisfies the condition 0 ≤ x ≤ 1 for every element x of U and is closed under the operations f and g. Lindenbaum also proved (theorem 19 in Ł–T1930) the following logico–arithmetical result (the converse of a result obtained earlier by Łukasiewicz): for 2 ≤ m < ℵ0 and 2 ≤ n < ℵ0, we have the inclusion LOGm ⊆ LOGn if and only if n − 1 is a divisor of m − 1. The next theorem, established by Lindenbaum for n = 3 (Tarski generalized the result to any prime number n), says that if n is a prime number, there are only two systems, L (the entire language) and LOG2, which contain LOGn as a proper part. Lindenbaum also proved (Ł–T1930, theorem 23) that every logic LOGn is axiomatizable for any 1 < n < ℵ0.
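
The inclusion theorem just stated can be illustrated numerically. Assuming the standard presentation in which the value set of LOGn is {0, 1/(n − 1), …, 1} with the functions f and g given above (an assumption made here for illustration), the values of LOGn form a subset of the values of LOGm exactly when n − 1 divides m − 1; since every LOGn–valuation is then also a LOGm–valuation, every LOGm–tautology is a LOGn–tautology. The sketch below checks this value–set containment for small m and n.

    from fractions import Fraction

    def values(n):
        """Value set of the n-valued matrix: {0, 1/(n-1), ..., 1}."""
        return {Fraction(k, n - 1) for k in range(n)}

    def f(x, y):            # "implication": f(x, y) = min(1, 1 - x + y)
        return min(Fraction(1), 1 - x + y)

    def g(x):               # "negation": g(x) = 1 - x
        return 1 - x

    for m in range(2, 10):
        for n in range(2, 10):
            contained = values(n) <= values(m)       # is V_n a subset of V_m?
            divides = (m - 1) % (n - 1) == 0         # does n - 1 divide m - 1?
            assert contained == divides, (m, n)
            # V_n is closed under f and g, so it determines a submatrix:
            assert all(f(x, y) in values(n) and g(x) in values(n)
                       for x in values(n) for y in values(n))
    print("value-set containment matches the divisibility condition for 2 <= m, n < 10")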

Although the above results are general, they were mainly directed at reporting facts about many–valued logics. Lindenbaum also obtained results about arbitrary sentential calculi and their matrices. The definition of M (plus the definition of a sentential calculus as a set closed under the consequence operation) implies that, provided M is normal, the set of tautologies of the matrix, that is, LOG(M), is a system. Lindenbaum announced (theorem 3 in Ł–T1930) that every sentential calculus has an at most denumerably infinite normal matrix. This theorem was proved by Jerzy Łoś (see Łoś 1949; this work contains the first systematic treatment of logical matrices). From this last work originated systematic investigations of matrix semantics for propositional calculi (see Wójcicki 1989 for an extensive report on this field of logical research; several important results are also described in Pogorzelski 1994). The following historical speculation can illustrate the importance of theorem 3. When Heyting formalized intuitionistic sentential logic (ISC) in 1930, the question of finding a normal matrix for this logic became important. Gödel showed (in 1932) that no finite matrix (he used the term “realization”) verifies all theorems of ISC. By Lindenbaum’s result (Gödel did not refer to it), there exists a denumerably infinite normal matrix for ISC. Jaśkowski constructed such a matrix in 1936. He certainly must have known Ł–T1930, but he did not refer to it. This story is a nice example of how influences may have interacted in the search for an adequate matrix for intuitionistic propositional logic, although we have no direct evidence that Lindenbaum’s theorem actually inspired Jaśkowski.

5. Lindenbaum and General Metamathematics

Several of Lindenbaum’s contributions to general metamathematics are mentioned in Tarski 1956 (referred to below as T1956). The two most important results achieved by Lindenbaum are his construction of the so–called Lindenbaum algebra (LIA) and the maximalization theorem (LMT, frequently called the Lindenbaum Lemma). Lindenbaum observed that formulas themselves can serve as the elements of logical matrices (see Surma 1967, Surma 1973, Surma 1982 for the reconstruction of Lindenbaum’s path to LIA). Then he, as well as other logicians (Łoś, for instance), generalized this idea to arbitrary languages. LIA is presented in this article for the classical sentential calculus. Let L be a formal language with ¬, ∧, ∨, ⇒, ⇔ as connectives. Formulas as such, that is, variables and their well–formed strings, do not constitute an algebra. The symbol [A] refers to the Lindenbaum class of formulas with respect to A. We stipulate that B ∈ [A] if and only if ⊢ A ⇔ B (this step gives a congruence on L) and then define −[A] as [¬A], [A] ∩ [B] as [A ∧ B], [A] ∪ [B] as [A ∨ B], [A] ⊆ [B] as holding whenever ⊢ A ⇒ B, and [A] = [B] as holding whenever ⊢ A ⇔ B. If we denote the set of classes of formulas produced by the defined congruence by the symbol L[], the structure < L[], −, ∩, ∪, ⊆, = > is a Boolean algebra of formulas, that is, LIA for sentential logic. An interesting feature of this construction is that the building blocks of LIA come from language. This justifies the use of the same symbols for propositional connectives in the object language (the language of propositional calculus) and in the metalanguage (the language of LIA). Lindenbaum’s construction of algebras has important applications in algebraic proofs of metalogical theorems (see Surma 1967, Rasiowa–Sikorski 1970, Surma 1973, Rasiowa 1974, Surma 1982, Zygmunt–Purdy 2014), including the completeness theorem for classical logic as well as for many non–classical systems.
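
The quotient construction can be made concrete for a small fragment of classical sentential logic. The sketch below is an illustration prepared for this article, not Lindenbaum’s own procedure: it represents the Lindenbaum class of a formula over the variables p and q by the formula’s truth table (which, by the completeness of classical propositional logic, determines the same classes as provable equivalence ⊢ A ⇔ B) and checks one instance of the Boolean structure.

    from itertools import product

    VARS = ['p', 'q']
    VALUATIONS = list(product([False, True], repeat=len(VARS)))

    def lindenbaum_class(formula):
        """Represent [A] by A's truth table: by completeness, two formulas of classical
        sentential logic lie in the same Lindenbaum class iff they share a truth table."""
        def ev(fm, val):
            env = dict(zip(VARS, val))
            if isinstance(fm, str):
                return env[fm]
            op, *args = fm
            a = ev(args[0], val)
            if op == 'neg':
                return not a
            b = ev(args[1], val)
            return {'and': a and b, 'or': a or b, 'imp': (not a) or b}[op]
        return tuple(ev(formula, val) for val in VALUATIONS)

    # Boolean operations on classes, mirroring -[A], [A] meet [B], [A] join [B]:
    def comp(c):    return tuple(not x for x in c)
    def meet(c, d): return tuple(x and y for x, y in zip(c, d))
    def join(c, d): return tuple(x or y for x, y in zip(c, d))

    # As one expects in a Boolean algebra, the class of p => q equals -[p] join [q]:
    assert lindenbaum_class(('imp', 'p', 'q')) == join(comp(lindenbaum_class('p')),
                                                       lindenbaum_class('q'))
    # Over two variables the quotient has 2^(2^2) = 16 classes in total.
    print("the sample identity holds; the full algebra over p, q has 16 classes")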

There is a controversy concerning the origin of LIA. According to Surma 1967, p. 128, Lindenbaum presented his idea at the 1st Polish Mathematical Congress in 1927. This fact, however, is known only from the Polish oral tradition. Significant information can be found in Rasiowa–Sikorski 1970, pp. 245–246, footnote 1. The first published mention was made by McKinsey, with reference to Tarski’s oral communication, in 1941. However, Tarski complained that the discovery of LIA should be credited to him (this is reported in the mentioned footnote in Rasiowa–Sikorski 1970). Since Tarski’s historical claim concerning LIA became known, some authors have used the label “the Lindenbaum–Tarski algebra”.

LMT is perhaps the most important result achieved by Lindenbaum. Its formulation is very simple: every consistent formal system has a maximal consistent extension. The theorem was probably inspired by the concept of Post–completeness, well known in Łukasiewicz’s group. The theorem is mentioned several times in T1956; its proof is to be found on pp. 98–100. An important feature of LMT is that it is not intuitionistically provable (although we know that maximal extensions exist, there is no general method of constructing them) and that it requires infinitistic methods (it is, however, weaker than AC; see below). T1956 points out many applications of LMT, for instance, that classical logic is the only consistent extension of intuitionistic logic (Tarski). The great career of LMT began after Henkin used it in his proof of the completeness of first–order logic. This proof combines LIA with the construction of maximal consistent sets of formulas. Henkin’s method was subsequently adapted for proving the completeness of many logical systems. LMT is effectively equivalent to the Stone ultrafilter theorem. One can prove (see Surma 1968) that AC implies the Gödel–Malcev completeness theorem, and the latter entails LMT. The standard version of LMT works for countable languages (see Łoś 1955 for LMT for uncountable languages) and a compact consequence operation (omitting the property of compactness is not particularly significant). On the other hand, if LMT is strengthened by an additional assumption concerning individual constants or by admitting uncountable languages, it becomes provably equivalent to AC (see also Gazzari 2014). These results help to place LMT on the scale of infinitary methods in metamathematics.
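
The usual textbook formulation and proof skeleton of LMT, for a countable language and a compact consequence operation Cn, can be written as follows; this is a schematic reconstruction, not Lindenbaum’s original wording.

    % Lindenbaum's maximalization theorem (schematic reconstruction):
    % every consistent set X has a maximal consistent extension X*.
    Let $A_0, A_1, A_2, \dots$ enumerate all formulas of the language, and define
    \[
    X_0 = X, \qquad
    X_{n+1} =
    \begin{cases}
    X_n \cup \{A_n\} & \text{if } X_n \cup \{A_n\} \text{ is consistent},\\
    X_n & \text{otherwise},
    \end{cases}
    \qquad
    X^{*} = \bigcup_{n \in \mathbb{N}} X_n .
    \]
    % By the compactness of Cn, X* is consistent; by construction it is maximal:
    % adding to X* any formula not already in Cn(X*) yields an inconsistent set.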

The program of reverse mathematics (see Simpson 2009) gives a more general perspective in this respect. The symbol RCA0 refers to the system consisting of Peano arithmetic with the induction scheme restricted to Σ01 formulas plus the recursive comprehension scheme; it is a relatively weak subsystem of second–order arithmetic. LMT is equivalent over RCA0 to each of the following propositions: the weak König lemma, the Gödel–Malcev completeness theorem, the Gödel compactness theorem, the completeness theorem for propositional logic for countable languages, and the compactness theorem for propositional logic for countable languages. Since the weak König lemma is a mathematical rather than a metalogical result, its equivalence with LMT exactly characterizes the mathematical content of the latter. If LMT is associated with a consequence operation admitting the rule of substitution, we obtain the so–called relative Lindenbaum extensions (see Pogorzelski 1994, p. 318; this theorem was proved by Asser in 1959). Roughly speaking, if we take a consistent set X and a formula A that is independent of X (neither A nor ¬A is derivable from X), we obtain two consistent extensions of X, namely X ∪ {A} and X ∪ {¬A}. Clearly, by the standard LMT, there then exist at least two different maximally consistent extensions of X. The problem of how many such relative Lindenbaum sets are associated with a given consistent set X has no unique solution.
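
The relations reported in this and the preceding paragraph may be summarized schematically (an editorial restatement of the claims above, not an additional result):

    % Over set theory:
    \[
    \mathrm{AC} \;\Longrightarrow\; \text{G\"odel--Malcev completeness} \;\Longrightarrow\; \mathrm{LMT}
    \;\Longleftrightarrow\; \text{Stone ultrafilter theorem};
    \]
    % over the base theory RCA_0 of reverse mathematics:
    \[
    \mathrm{RCA}_0 \vdash\ \mathrm{LMT} \leftrightarrow \text{weak K\"onig lemma}
    \leftrightarrow \text{completeness} \leftrightarrow \text{compactness}
    \quad (\text{countable languages}).
    \]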

Here is a list of some other of Lindenbaum’s metamathematical results (see Tarski 1956, 32, 33, 36, 71, 297, 307, 338 for the technical details):

  • the number of all deductive systems is equal to 2^ℵ0;
  • the number of all axiomatizable systems is equal to ℵ0;
  • the condition which must be satisfied in order for the sum of deductive systems to be a deductive system;
  • structural type of a theory;
  • theorems of degrees of completeness;
  • atomic (atomistic, according to an older terminology) Boolean algebra;
  • independence of primitive concepts in mathematical systems.

The last three results in the above list were achieved jointly by Lindenbaum and Tarski (see Tarski–Lindenbaum 1927), and the other results inspired Tarski in his metamathematical investigations. Moreover, Tarski credited Lindenbaum with pointing out the role of set–theoretical methods in metamathematical investigations (T1956, p. 75).

6. Final Remarks

Helena Rasiowa (in Rasiowa 1974, p. v) says that the introduction of the Lindenbaum–Tarski algebra became “one of the turning points in algebraic study of logic”. This tradition was continued, systematized and conceptually unified by Tarski himself as well as by his American students, particularly J. C. C. McKinsey, Bjarni Jónsson and Don Pigozzi, and by Polish logicians, notably Jerzy Łoś, Helena Rasiowa and Roman Sikorski. Rasiowa–Sikorski 1970 can be considered the opus magnum in this direction. A similar role should be attributed to LMT (which is not properly called the Lindenbaum Lemma, because its actual importance exceeds the fact of being an auxiliary device for proving other results) as a mark of the mathematical content of the tools used in metamathematics. Thus, Adolf Lindenbaum appears as one of the main masters in the development of mathematics for metamathematics. His results on logical matrices opened a new stage in metalogical investigations concerning the propositional calculus.

7. References and Further Readings

  • Gazzari, R. 2014, “Direct Proofs of Lindenbaum Conditionals”, Logica Universalis 8, Issue 3–4, 321–343.
  • Hinkis, A. 2013, Proofs of the Cantor-Bernstein Theorem. A Mathematical Excursion, Basel: Birkhäuser.
  • Lindenbaum, A. 1930, “Remarques sur une question de la methode mathematique”, Fundamenta Mathematicae 15, 313–321.
  • Lindenbaum, A. 1931, “Bemerkungen zu den vorhergehendem “Bemerkungen” des Herrn J. v. Neumann”, Fundamenta Mathematicae 17, 335–336.
  • Lindenbaum, A. 1936, “Sur la simplicité formelle des notions”, in Actes du Congrès International de Philosophie Scientifique, VII Logique (Actualités Scientifiques et Industrielles 394), 29–38.
  • Lindenbaum, A.–Mostowski, A. 1938, “Über die Unabhängigkeit des Auswahlaxioms und einiger seiner Folgerungen”, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, 31, 27–32; Eng. tr. in A. Mostowski, Foundational Studies. Selected Works, Vol. II, Warszawa–Amsterdam: PWN–Polish Scientific Publishers – North–Holland Publishing Company, 70–74.
  • Lindenbaum, A.–Tarski, A. 1926, “Communication sur les recherches de la théorie des ensembles”, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, 19, 299–330; repr. in A. Tarski, Collected Papers, vol. 1, 1921–1934, Basel: Birkhäuser 1986, 171–204.
  • Lindenbaum, A.–Tarski, A. 1934–1935, “Über die Beschränktheit der Ausdrucksmittel deduktiver Theorien”, Ergebnisse eines mathematischen Kolloquiums 7, 15–22; Eng. tr. in Tarski 1956, 384–392.
  • Łoś, J. 1949, O matrycach logicznych (On Logical Matrices), Wrocław: Wrocławskie Towarzystwo Naukowe.
  • Łoś, J. 1955, “The Algebraic Treatment of the Methodology of Elementary Deductive Systems”, Studia Logica 2, 151–212.
  • Łukasiewicz, J.–Tarski, A. 1930, “Untersuchungen über den Aussagenkalkül”, Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, Classe III, 23, 30–50; Eng. tr. in Tarski 1956, 38–59.
  • Marczewski, E.–Mostowski, A. 1971, “Lindenbaum Adolf (1904–1941)”, Polski Słownik Biograficzny 17, 364b–365b.
  • Moore, G. H. 1982, Zermelo’s Axiom of Choice. Its Origins, Development, and Influence, New York – Heidelberg – Berlin: Springer Verlag.
  • Pogorzelski, W. 1994, Notions and Theories of Elementary Formal Logic, Białystok: Warsaw University – Białystok Branch.
  • Rasiowa, H. 1974, An Algebraic Approach to Non-Classical Logics, Amsterdam – Warszawa: PWN–Polish–Scientific Publishers – North–Holland Publishing Company.
  • Rasiowa, H.–Sikorski, R. 1970, The Mathematics of Metamathematics, Warszawa: PWN– Polish Scientific Publishers.
  • Sierpiński, W. 1965, Cardinal and Ordinal Numbers, Warszawa: PWN – Polish Scientific Publishers.
  • Simpson, S. G. 2009, Subsystems of Second Order Arithmetic, Cambridge: Cambridge University Press.
  • Surma, S. J. 1967, “History of Logical Applications of the Method of Lindenbaum’s Algebra”, Analele Universităţii Bucureşti, Acta Logica 10, 127–138.
  • Surma, S. J. 1968, “Some Metamathematical Equivalents of the Axiom of Choice”, Prace z Logiki 3, 71–80.
  • Surma, S. J. 1973, “The Concept of Lindenbaum Algebra and Its Genesis”, in Studies in the History of Mathematical Logic, ed. by S. J. Surma, Wrocław: Ossolineum, 239–253.
  • Surma, S. J. 1982, “On the Origins and Subsequent Applications of the Concept of Lindenbaum Algebra”, in Logic, Methodology and Philosophy of Science VI. Proceedings of the Sixth International Congress of Logic, Methodology and Philosophy of Science, Hannover 1979, ed. by L. J. Cohen, J. Łoś, H. Pfeiffer and K.–P. Podewski, Warszawa – Amsterdam: PWN Polish Scientific Publishers – North–Holland Publishing Company, 719–734.
  • Tarski, A.–Lindenbaum, A. 1927, “Sur l’indépendance des notions primitives dans les systèmes mathématiques”, Annales de la Société Polonaise de Mathématique 7, 111–113.
  • Tarski, A. 1949, Cardinal Algebras, New York: Oxford University Press.
  • Tarski, A. 1956, Logic, Semantics, Metamathematics. Papers from 1923 to 1939, Oxford: Clarendon Press.
  • Tarski, A. 1986, “What are Logical Notions?”, History and Philosophy of Logic 7, 143–154.
  • Woleński, J. 1995, “Mathematical Logic in Poland 1900–1939: People, Circles, Institutions, Ideas”, Modern Logic V(4), 363–405; repr. in J. Woleński, Essays in the History of Logic and Logical Philosophy, Kraków: Jagiellonian University Press 1999, 59–84.
  • Wójcicki, R. 1989, Theory of Logical Calculi. Basic Theory of Consequence Operations, Dordrecht: Kluwer Academic Publishers.
  • Zygmunt, J.–Purdy, R. 2014, “Adolf Lindenbaum: Notes on His Life with Bibliography and Selected References”, Logica Universalis 8, Issue 3–4, 285–320.

 

Author Information

Jan Woleński
Email: wolenski@if.uj.edu.pl
University of Technology, Management and Information
Poland

Institution Theory

Institution theory is a very general mathematical study of formal logical systems—with emphasis on semantics—that is not committed to any particular concrete logical system. It is based upon a mathematical definition for the informal notion of logical system, called institution, which includes both syntax and semantics as well as the relationship between them. Because of its very high level of abstraction, this definition accommodates not only well-established logical systems but also very unconventional ones; moreover, it has served, and may continue to serve, as a template for defining new ones.

There is some criticism that the abstraction power of institutions is too great, allowing for examples that can hardly be recognised as logical systems. Institution theory is nevertheless part of the universal logic trend (Béziau, 2012), which approaches logic from a relativistic, non-substantialist perspective that is quite different from the common reading of logic, both in philosophy and in the exact sciences. However, institution theory should not be regarded as opposed to the established tradition of logic, since it includes it from a higher level of abstraction. In fact the real difference may occur at the level of methodology: top-down (in the case of institution theory) versus bottom-up (in the case of the conventional logic tradition). This means that, in institution theory, concepts come naturally as presumed features that a logical system might or might not exhibit, and they are defined at the most appropriate level of abstraction; hypotheses are kept as general as possible and introduced on an as-needed basis. This leads to a deeper understanding of logic phenomena, one that is not hindered by the largely irrelevant details of particular logical systems but is guided by structurally clean causality. In the exposition, after discussing the history of institution theory, its main concepts are presented. Then there is a discussion of the main contributions of institution theory, ranging from pure mathematical logic to applied computing science. A special point here is the institution-theoretic method of doing logic by translation, which means handling logical issues rather indirectly, by transporting them across logical systems and solving them at the most appropriate place. After this, some extensions of mainstream institution theory are presented briefly.

Table of Contents

  1. History
    1. Origins
    2. Early Developments
    3. Later Developments
    4. Notes
  2. The Concept of Institution
    1. Signatures
    2. Sentences
    3. Models
    4. Satisfaction
    5. Concrete Institutions
    6. Notes
  3. Institution-independent Model Theory
    1. The Galois Connection between Syntax and Semantics
    2. Logical Connectors
    3. Quantifiers
    4. Diagrams
    5. Ultraproducts
    6. Interpolation and Definability
    7. Layered Completeness
    8. Notes
  4. Logic by Translation
    1. Morphisms and Comorphisms
    2. Encodings
    3. Borrowing
    4. Notes
  5. Contributions to Computing Science
    1. Structured Specifications
    2. Heterogeneous Specifications
    3. Ontologies
    4. Logic Programming
    5. Notes
  6. Extensions
    1. Many-valued Truth
    2. Modalities
    3. Notes
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources
    3. Auxiliary Non-Institutional Sources

1. History

a. Origins

Institution theory was introduced by Joseph Goguen and Rod Burstall in the late seventies as a response to the explosion in the population of logical systems in use in formal specification theory and practice. Formal specification is a logic-based area of computer science that aims to support reliable system development through the axiomatic formalisation of system structure and functionality. At the time (and even more so now) there was a great diversity of specification formalisms, each of them supported by a particular underlying logical system. Hence the need for a uniform approach to specification theory capable of developing those parts of the theory that are independent of the choice of a particular logical system, and thus common to many specification formalisms. The key step was the definition of the concept of institution in (Goguen and Burstall, 1984), intended to capture formally the structural essence of logical systems beyond specific details. Since semantics plays the primary role in formal specification, institutions lean towards the semantic side of logic, known as model theory. This aspect has permanently constituted a source of criticism from the side of syntax- and proof-oriented logicians and, at the same time, a source of celebration from the side of the semantics-oriented ones.

The concept of institution has two theoretical sources. One is the abstract model theory of Barwise (1974), and the other is the category theory of Eilenberg and Mac Lane (1945); Mac Lane (1998). While the importance of the latter in traditional areas of mathematics remains controversial, it has gained a major status in theoretical computing science (see (Goguen, 1991)) and logic. Although institutions are mathematically categorical structures, their spirit is that of abstract model theory.

b. Early Developments

The first paper on institution theory (Goguen and Burstall, 1984) introduced the main concepts and in parallel illustrated them on the example of capturing many-sorted equational logic as an institution, at the time the most traditional and important specification logic. Among the basics of institution theory developed there were the Galois connection between syntax and semantics as well as various concepts and results related to the structuring of specifications and programs. The latter task was carried much further in the influential work (Sannella and Tarlecki, 1988). Another important early development was the introduction in (Tarlecki, 1986b) of the treatment of classical connectives (conjunction, disjunction, negation, and so forth) and of quantification in abstract institutions, and also of other logic concepts such as interpolation. That was a clear indication that institution theory may reach far beyond its original goal of providing a very general theoretical platform for formal specification. The work (Tarlecki, 1986c) was the first paper on institution theory that developed deep results having a logic (model theory) rather than a computing science flavour. In parallel with these developments the list of logical systems captured as institutions kept growing, with quite unconventional ones being added, a process fueled mainly by the increasing diversity of computing science logics. Very often the effort to formally capture particular logical systems as institutions has led to (re)considerations, within the respective logical setups, of some basic logical concepts, such as variable, language (or vocabulary, signature), model, sentence, and so forth. Presenting logical systems as precise and coherent mathematical objects proves to be more than a simple exercise; it has led to new understanding of some aspects of particular logical systems.

c. Later Developments

The work (Meseguer, 1989) extended the concept of institution, which has a pronounced semantic character, to include proof-theoretic concepts, thus opening the possibility of a general institution-theoretic approach to logical calculi.

Although the conceptual infrastructure was already in place right from the beginning of institution theory, the first substantial institution-theoretic work in the direction of doing logic by translation is (Cerioli and Meseguer, 1997). Many other works in the same direction followed, most notably (Mossakowski et al., 2009), the winner of the contest of the 2nd World Congress in Universal Logic (Xi’an, China, 2007).

An important trend within institution theory is less motivated by computing science and more by model theory research. Several important model theory methods have been developed at the level of abstract institutions, and many very general yet deep results have been obtained. This has resulted in a very abstract form of model theory, often referred to as institution-independent model theory or, synonymously, institutional model theory. The monograph (Diaconescu, 2008) provides a snapshot of this dynamic area. Many of the institution-independent model theory results constitute high generalisations of well-known results from conventional concrete model theory and can be used for easily obtaining corresponding results in less conventional logical systems. The same can be said of model-theoretic concepts. A lot of theoretical computer science has been developed within institution theory based on the principle that formal specification and declarative programming languages should be based rigorously upon an underlying concrete institution. Based upon a large body of institution-theoretic developments, two modern specification languages have been designed by following this principle: CafeOBJ in Japan (Diaconescu and Futatsugi, 1998) and CASL in Europe (Astesiano et al., 2002). Both developments (the latter via the Hets environment (Mossakowski, 2005)) acknowledged the importance of logically heterogeneous environments, where several logical systems, rather than only one, are used via appropriate translations between them. For this, institution theory was able to accommodate a construction from algebraic geometry due to the French mathematician Alexandre Grothendieck, and flatten any such heterogeneous environment to a single institution (Diaconescu, 2002; Mossakowski, 2002), with the benefit of avoiding the rather big effort of redeveloping concepts and results for the heterogeneous situation. Institution theory plays the core role in the OMG standard Ontology Integration and Interoperability (OntoIOp).

d. Notes

Although (Goguen and Burstall, 1984) may be considered the first prominent publication in the now rather vast institution theory literature, the seminal reference of the area is considered to be (Goguen and Burstall, 1992). The large time gap between these two publications is due to a very slow editorial process. Some critics consider the term ‘institution’ uninspired, since it does not convey anything about the scientific or the philosophical meaning of the concept. Goguen said that they chose this name, somewhat half-jokingly, in response to the sectarianism that was taking over the specification community at the time. Around particular specification formalisms, people were building real social institutions consisting of dedicated conferences, publication forums, user groups, and so forth.

2. The Concept of Institution

An institution is a mathematical structure that can be regarded as a template for capturing logical systems mathematically. Many argue that this template is general enough to accommodate anything that may be called ‘logic’, or at least any logical system based on satisfaction between sentences and models of any kind. The concept of institution relies heavily upon category theory concepts, but in a rather elementary way. This means that most of institution theory does not involve sophisticated or advanced levels of category theory. An institution consists of four kinds of entities: signatures, sentences, models, and the satisfaction relation between models and sentences. All these are considered fully abstractly and axiomatically. This means the focus is on their external properties, on how they relate to the other entities, rather than on what they actually are or may be.

a. Signatures

When assuming a logical context, the first thing to be done is to fix a collection of symbols as primitive building blocks for the syntactic constructs. In logic this is usually called a language or vocabulary. In computing science it is usually called a signature, and so it is in institution theory. In a signature the symbols are also usually structured, often into rather complex structures. This is especially true in the context of many modern computing science logics. But institution theory encapsulates all such information and treats signatures as fully abstract entities. In addition to this, institution theory considers that in a logical system signatures may vary. This comes from the practice of formal specification, where signatures are defined locally. Hence in any institution we have a collection of signatures rather than only one signature. However, this is not taken as a discrete collection, as institution theory also considers interpretations between signatures, called morphisms of signatures; these are also considered fully abstractly. The only data is that any signature morphism \varphi has a source signature \Sigma and a target signature \Sigma'; this is denoted \varphi \,\colon\; \Sigma \rightarrow \Sigma' by employing the common mathematical notation of a function. Note that in concrete examples signature morphisms are not necessarily functions; they may be much more complex interpretations that preserve the respective structure of the signatures. Also, given two signatures, in general nothing prevents the existence of more than one morphism between them, or the non-existence of any morphism between them.

In the case of concrete institutions, that is, of examples, one has to define precisely what the signatures and their morphisms are. Let us consider the simple example of (the institution of) classical propositional logic, which we denote by \mathit{PL}. Here a signature is just a set P, traditionally referred to as a set of ‘propositional variables’. It is important that P, although arbitrary, is a fixed set. Another set P' gives another signature. And any function \varphi \,\colon\; P \rightarrow P' is a signature morphism. Note that the Boolean connectors \wedge, \neg, etc., although they contribute to the building of the propositional logic statements, are not part of the signatures.

The only property that institution theory considers for signatures and their morphisms is that when

    \[\varphi \,\colon\; \Sigma \rightarrow \Sigma'\]

and

    \[\varphi' \,\colon\; \Sigma' \rightarrow \Sigma''\]

there exists another signature morphism

    \[\varphi;\varphi' \,\colon\; \Sigma \rightarrow \Sigma''\]

called the composition of \varphi and \varphi'. This composition should satisfy some associativity and identity laws, hence very compactly we say that in any institution the signatures and their morphisms form a category. In our \mathit{PL} example this is just the most standard category, the category \mathit{\mathrm{SET}} of sets with functions as morphisms, with the composition \varphi;\varphi' being just the set theoretic composition \varphi' \circ \varphi, i.e.

    \[(\varphi;\varphi')(x) = (\varphi' \circ \varphi)(x) = \varphi' (\varphi (x))\]
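
As a small, purely illustrative aside (the representation choices below are ours, not part of the definition), \mathit{PL} signatures can be coded in Python as sets and signature morphisms as dictionaries, with composition computed exactly as in the displayed equation:

    # Two PL signatures are just sets of propositional variables.
    P0 = {"p", "q"}
    P1 = {"a", "b", "c"}
    P2 = {"x"}

    # Signature morphisms are arbitrary functions, here coded as dictionaries.
    phi = {"p": "a", "q": "c"}            # a morphism P0 -> P1
    psi = {"a": "x", "b": "x", "c": "x"}  # a morphism P1 -> P2

    def compose(f, g):
        """The composition f;g of signature morphisms: (f;g)(x) = g(f(x))."""
        return {p: g[f[p]] for p in f}

    assert set(phi) == P0 and set(phi.values()) <= P1
    assert compose(phi, psi) == {"p": "x", "q": "x"}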

b. Sentences

In any particular logical system, once a language (signature) is assumed, we can have logical statements or sentences. The collection of sentences depends on the assumed language (signature). In institution theory this very basic principle is reflected by a designated function (\mathrm{Sen}) that maps each signature to a set (of presumed sentences). At the abstract level one does not bother about what this function is; it is considered fully abstractly. For each signature \Sigma we just call the elements of \mathrm{Sen}(\Sigma) ‘\Sigma-sentences’. However, the concrete institutions need to define these. For instance, given a \mathit{PL} signature P (which is just a set), \mathrm{Sen}(P) is the set of all formulæ built from P by using the connectors \wedge and \neg. If

    \[P = \{ \pi_1, \pi_2 \}\]

then for example

    \[(\neg \pi_1) \wedge \pi_2 \in \mathrm{Sen}(P)\]

Since semantically the other propositional logic connectors such as implication \Rightarrow, disjunction \vee, etc. can be expressed in terms of \wedge and \neg, the latter two are enough to obtain all the expressive power of propositional logic.

Signature morphisms are reflected at the level of sentences as translation mappings. That is, for any signature morphism \varphi \,\colon\; \Sigma \rightarrow \Sigma' there exists a function

    \[\mathrm{Sen}(\varphi) \,\colon\; \mathrm{Sen}(\Sigma)\rightarrow \mathrm{Sen}(\Sigma')\]

For abstract institutions \mathrm{Sen}(\varphi) is considered abstractly, but in concrete institutions we have to define it. Usually \mathrm{Sen}(\varphi) just replaces the symbols in \Sigma-sentences according to \varphi. For example, given the only existing \mathit{PL} signature morphism

    \[\varphi \,\colon\; P \rightarrow P'\]

where P is as above and P' = \{ \pi \} then

    \[\mathrm{Sen}(\varphi)((\neg \pi_1)\wedge \pi_2) = (\neg \pi) \wedge \pi\]

In other situations things are not that straightforward. For instance, in many-sorted quantified logics the translation of quantified sentences gets a little sophisticated due to the management of the first-order variables (for example, to make sure that a translated variable does not clash with an existing constant of the target signature). From our example we may easily note a very simple coherence property of the sentence translation: for any composition of signature morphisms \varphi;\theta we have that, for each sentence \rho,

    \[\mathrm{Sen}(\varphi;\theta)(\rho) = \mathrm{Sen}(\theta)(\mathrm{Sen}(\varphi)(\rho))\]

If we also add that the identity signature morphisms (i.e. those keeping everything as it is) always get mapped to identity functions, then in category theory terminology we can just say that \mathrm{Sen} is a functor from the category \mathrm{Sig} of the signatures to the category \mathit{\mathrm{SET}} of sets and functions. At the level of abstract institutions this property of the \mathit{PL} sentence translations is given as an axiom that characterises the sentence part of the general definition of institution.
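
Continuing the illustrative Python coding of the running \mathit{PL} example (the tuple representation of sentences is our own choice, not part of the institution), the translation \mathrm{Sen}(\varphi) and its functoriality can be sketched as follows:

    def sen_translate(phi, sentence):
        """Sen(phi): rename the propositional variables of a PL sentence,
        represented as nested tuples ('var', p), ('not', s), ('and', s1, s2)."""
        tag = sentence[0]
        if tag == "var":
            return ("var", phi[sentence[1]])
        if tag == "not":
            return ("not", sen_translate(phi, sentence[1]))
        if tag == "and":
            return ("and", sen_translate(phi, sentence[1]), sen_translate(phi, sentence[2]))
        raise ValueError("unknown connector")

    # The sentence (not pi1) and pi2 over P = {pi1, pi2}.
    rho = ("and", ("not", ("var", "pi1")), ("var", "pi2"))

    phi = {"pi1": "pi", "pi2": "pi"}   # the unique morphism P -> {pi}
    assert sen_translate(phi, rho) == ("and", ("not", ("var", "pi")), ("var", "pi"))

    # Functoriality: translating along a composite equals translating twice.
    theta = {"pi": "chi"}              # a further morphism {pi} -> {chi}
    composite = {p: theta[phi[p]] for p in phi}
    assert sen_translate(composite, rho) == sen_translate(theta, sen_translate(phi, rho))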

c. Models

On the semantic side, each signature can be interpreted as a collection of models. In general for each signature \Sigma there can be several models (called \Sigma-models), an aspect that gives model theory its relativistic character. For instance, the models of a \mathit{PL} signature P are the valuations of the ‘propositional variables’ of P to the truth values 0 and 1, which is the same as the subsets of P (obtained by retaining only the elements of P that are valuated to 1). As with signatures and sentences, at the abstract level, that of the definition of the concept of institution, the models are also considered fully abstractly. As for signatures, but unlike for the sentences (of a given signature), we also consider morphisms between models. This yields the same kind of mathematical structure for the collection of the \Sigma-models as in the case of \mathrm{Sig}, that of a category. Hence let \mathrm{Mod}(\Sigma) denote the category of the \Sigma-models and their morphisms. In the case of \mathit{PL}, \mathrm{Mod}(P) has the particularity that given two models M and N there exists at most one morphism M \rightarrow N; this happens only when M \subseteq N (here we regard the P-models as subsets of P rather than valuations to \{ 0,1 \}).

The fact that in general \mathrm{Mod}(\Sigma) is defined as a category rather than a set (as \mathrm{Sen}(\Sigma) is), besides the morphisms aspect, has a rather subtle set-theoretic aspect. There may be so many \Sigma-models that they no longer constitute a set. In examples this is closely related to the fact that there does not exist ‘the set of all sets’, which would be a violation of one of the axioms of formal set theory. However, this does not always happen; for example, in \mathit{PL} the P-models do form a set.

Another important difference between the syntax and semantics sides of the definition of institution occurs at the level of the translations induced by the signature morphisms. This phenomenon can be best understood by looking at concrete examples, such as \mathit{PL}. Given a \mathit{PL} signature morphism \varphi \,\colon\; P \rightarrow P', the corresponding translation mapping goes from \mathrm{Mod}(P') to \mathrm{Mod}(P), opposite to the direction in which the sentence translation goes. Namely, each P'-model M' gets reduced to the P-model M' \circ \varphi (here for convenience we regard \mathit{PL}-models as valuations rather than subsets). This is a very important feature of the semantics, called the contravariance of the reduction. The use of the name ‘reduction’ instead of ‘translation’ is in fact meant to convey this aspect. This terminological choice can be easily understood when considering \varphi to be an inclusion P \subseteq P': as a valuation function, M' \circ \varphi is just the restriction of M' to P. Thus, at the general level, for any signature morphism \varphi\,\colon\; \Sigma \rightarrow \Sigma' there is a model reduct functor

    \[\mathrm{Mod}(\varphi)\,\colon\; \mathrm{Mod}(\Sigma')\rightarrow \mathrm{Mod}(\Sigma)\]

Taking into account that \mathrm{Mod}(\Sigma') and \mathrm{Mod}(\Sigma) are categories rather than sets, \mathrm{Mod}(\varphi) is a functor rather than a function. The behaviour of the model reduct functors with respect to the composition of signature morphisms is perfectly similar to what happens in the case of the sentence translations, of course modulo the contravariance aspect:

    \[\mathrm{Mod}(\varphi;\varphi')(M'') = \mathrm{Mod}(\varphi)(\mathrm{Mod}(\varphi')(M''))\]

All these can be formulated compactly just by saying that \mathrm{Mod} is a contravariant functor from \mathrm{Sig} to the category \mathit{\mathrm{CAT}} of categories and functors.
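
In the same illustrative Python coding, \mathit{PL} models can be taken as valuations (dictionaries into {0,1}) and the reduct \mathrm{Mod}(\varphi) becomes pre-composition with \varphi, which makes the contravariance visible:

    def mod_reduct(phi, model_prime):
        """Mod(phi): reduce a P'-model (a valuation on P') to a P-model
        by pre-composing with the signature morphism phi : P -> P'."""
        return {p: model_prime[phi[p]] for p in phi}

    phi = {"pi1": "pi", "pi2": "pi"}   # P = {pi1, pi2} -> P' = {pi}
    M_prime = {"pi": 1}                # a P'-model

    # The reduct goes in the opposite direction: from P'-models to P-models.
    assert mod_reduct(phi, M_prime) == {"pi1": 1, "pi2": 1}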

d. Satisfaction

At the heart of the semantic concept of truth, promoted by Tarski (1944) and employed by institution theory, lies the satisfaction relation between models and sentences. For example, in \mathit{PL} given a signature P, a P-model M and a P-sentence \rho, we may evaluate \rho for the valuation M as a truth value 0 or 1 by applying the well known truth table of classical propositional logic inductively on the structure of \rho. Then we say that M satisfies \rho, written M \models \rho, if and only if the evaluation of \rho in M yields 1. Note that we speak about satisfaction only when the model and the sentence belong to the same signature.

Since in the definition of institutions signatures, sentences and models are fully abstract, the satisfaction of sentences by models is fully abstract too. Thus for each signature \Sigma there is a satisfaction relation between \Sigma-models and \Sigma-sentences, denoted \models_\Sigma. This is subject to an axiom known as the Satisfaction Condition, whose meaning is often informally explained as the invariance of truth with respect to the change of notation: for each signature morphism \varphi \,\colon\; \Sigma\rightarrow \Sigma', each \Sigma-sentence \rho and each \Sigma'-model M',

    \[ \mathrm{Mod}(\varphi)(M') \models_\Sigma \rho \text{ if and only if } M' \models_{\Sigma'} \mathrm{Sen}(\varphi)(\rho).\]

In concrete institutions one always has to define the satisfaction relation, commonly by induction on the structure of the sentences (as in the case of \mathit{PL}). This principle, known as truth functionality, means that given a semantic context (model) the truth value of any compound sentence is determined from the truth values of its components. Truth functionality also provides the common method to establish the Satisfaction Condition in concrete institutions, by induction on the structure of the sentences. While in some cases this is an easy task (\mathit{PL} is such an example), in other cases it can be highly non-trivial. Especially in the case of quantified logics, the induction step corresponding to the quantifications poses some technical problems, requiring some compositionality property for the models (see (Diaconescu, 2008)).
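
The following self-contained Python sketch (again with our own toy representations) evaluates \mathit{PL} satisfaction by induction on the structure of sentences and checks the Satisfaction Condition on the signature morphism used earlier; it only illustrates the condition on one example, it does not prove it.

    def satisfies(model, s):
        """M |= rho for PL: evaluate rho under the valuation `model`."""
        return (model[s[1]] == 1 if s[0] == "var"
                else not satisfies(model, s[1]) if s[0] == "not"
                else satisfies(model, s[1]) and satisfies(model, s[2]))

    def sen_translate(phi, s):
        """Sen(phi), as in the earlier sketch."""
        return (("var", phi[s[1]]) if s[0] == "var"
                else ("not", sen_translate(phi, s[1])) if s[0] == "not"
                else ("and", sen_translate(phi, s[1]), sen_translate(phi, s[2])))

    def mod_reduct(phi, model_prime):
        """Mod(phi), as in the earlier sketch."""
        return {p: model_prime[phi[p]] for p in phi}

    phi = {"pi1": "pi", "pi2": "pi"}
    rho = ("and", ("not", ("var", "pi1")), ("var", "pi2"))

    # Satisfaction Condition: Mod(phi)(M') |= rho  iff  M' |= Sen(phi)(rho),
    # checked over all P'-models, i.e. all valuations of {pi}.
    for M_prime in ({"pi": 0}, {"pi": 1}):
        assert satisfies(mod_reduct(phi, M_prime), rho) == \
               satisfies(M_prime, sen_translate(phi, rho))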

Our presentation of the mathematical definition of the concept of institution may be summarised as follows. An institution is a tuple (\mathrm{Sig},\mathrm{Sen},\mathrm{Mod},\models) where \mathrm{Sig} is a category, \mathrm{Sen} is a functor \mathrm{Sig} \rightarrow \mathit{\mathrm{SET}}, \mathrm{Mod} is a contravariant functor \mathrm{Sig} \rightarrow \mathit{\mathrm{CAT}}, and for each signature \Sigma, \models_\Sigma is a relation between \Sigma-models and \Sigma-sentences such that the Satisfaction Condition holds. Note that besides the Satisfaction Condition, the definition of institution includes other axioms encapsulated by the several categories and functors that are part of the definition.

e. Concrete Institutions

The institution theory literature (which includes part of the specification theory literature) contains countless examples of logical systems that have been formally captured as institutions. Among these there are first-, second-, and higher-order logics, logics with some form of partiality for the functions such as partial algebra and various dialects of order-sorted algebra, non-classical logics such as intuitionistic ones, a wide diversity of modal logics, fuzzy and many-valued logics, and so forth. All these institutions also admit many-sorted variants. Many examples of institutions arose on the basis of various combinations of other institutions. Several institutions related to programming look rather divorced from the common perception of what a logical system is, some of these being presented in (Sannella and Tarlecki, 2012).

f. Notes

The traditional style of doing logic has a rather global approach to signatures: they usually do not vary, and when they do, the change is just an extension. By contrast, institution theory is genuinely multi-signature oriented, with signature morphisms being a rather primitive concept. Moreover, in concrete examples the institution-theoretic view is that these can be broader than extensions: they may rename and even collapse elements. This widening of the concept of signature morphism serves the purposes of specification theory but is also very convenient for abstraction. Having signature morphisms as a primitive concept is also crucial for the development of various important logic concepts at the abstract level, for example, quantification, interpolation, the method of diagrams, saturated models, and so forth. Moreover, the generality of the concept of signature morphism in institution theory accommodates even first-order substitutions as in (Găină and Petria, 2010), or substitutions of propositional variables by compound propositions as in (Voutsadakis, 2002).

The Satisfaction Condition appeared for the first time as an axiom in (Barwise, 1974), but in a much less abstract context than institution theory. Although in logic this appears as an indisputable property of logical systems, in computing science there was some criticism that it is too strong and thus prevents some logical systems, albeit very marginal ones, from being institutions. More precisely, the critics of the Satisfaction Condition argued that the implication from right to left would be enough. The counter-criticism to this argues that, in the absence of the full Satisfaction Condition as an equivalence, almost all general institution theory results, both in model theory and in computing science, become impossible. In defence of the Satisfaction Condition, (Goguen, 1991) argues that those counterexamples arise from some heavy incoherence between the respective concepts of signature morphism and satisfaction relation, and that under a meaningful fix of the respective concept of signature morphism the Satisfaction Condition is rescued.

3. Institution-independent Model Theory

The institution-theoretic approach to model theory tends to be rather comprehensive; here we present some rudiments of it. Reading this section may sometimes require some technical inclination on the part of the reader.

a. The Galois Connection between Syntax and Semantics

Given a signature \Sigma in an institution we let

  • for each E\subseteq \mathrm{Sen}(\Sigma), E^* = \{ M \in \mathrm{Mod}(\Sigma) \mid M \models_\Sigma \rho \mbox{ for each } \rho\in E \}, and
  • for each \mathcal{M}\subseteq \mathrm{Mod}(\Sigma), \mathcal{M}^* = \{ \rho\in \mathrm{Sen}(\Sigma) \mid M \models_\Sigma \rho \mbox{ for each } M \in \mathcal{M} \}.

These give what is called a Galois connection between the subsets of \mathrm{Sen}(\Sigma) and those of \mathrm{Mod}(\Sigma). The * operators also allow for the definition of semantic consequence: given any set E of \Sigma-sentences and any \Sigma-sentence \rho, we say that \rho is a semantic consequence of E, written E \models \rho, when \rho \in E^{**} (that is, every model that satisfies every sentence in E satisfies \rho too).
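
For a finite \mathit{PL} signature the operator E^* and the induced semantic consequence can be computed by brute force over all valuations; the Python sketch below (reusing our toy sentence representation) does exactly that. The dual operator \mathcal{M}^* cannot be listed exhaustively, since there are infinitely many sentences, but it would be computed analogously over any finite stock of candidate sentences.

    from itertools import product

    def satisfies(model, s):
        return (model[s[1]] == 1 if s[0] == "var"
                else not satisfies(model, s[1]) if s[0] == "not"
                else satisfies(model, s[1]) and satisfies(model, s[2]))

    def all_models(signature):
        """All PL models of a finite signature, as valuations into {0,1}."""
        sig = sorted(signature)
        return [dict(zip(sig, bits)) for bits in product([0, 1], repeat=len(sig))]

    def models_of(E, signature):
        """E^*: the models satisfying every sentence in E."""
        return [M for M in all_models(signature) if all(satisfies(M, rho) for rho in E)]

    def consequence(E, rho, signature):
        """E |= rho: every model of E satisfies rho."""
        return all(satisfies(M, rho) for M in models_of(E, signature))

    P = {"p", "q"}
    E = [("var", "p"), ("not", ("var", "q"))]
    assert consequence(E, ("not", ("and", ("var", "p"), ("var", "q"))), P)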

b. Logical Connectors

The semantics of the Boolean connectors can be formally defined in institutions also by using the * operators. A \Sigma-sentence \rho is a conjunction of the \Sigma-sentences \rho_1 and \rho_2 when \rho^* = \rho_1^* \cap \rho_2^*. Or, \rho' is a negation of \rho when \rho'^* = \mathrm{Mod}(\Sigma) \setminus \rho^*. And similarly for disjunction, implication, etc. Note that, unlike their syntactic correspondents that are unique, the semantic conjunction, negation, implication, etc. are unique only up to semantic equivalence, which means that from this point of view sentences satisfied by the same models are indistinguishable. The definition of the Boolean connectors can be extended at the level of the whole institution: for example an institution has conjunctions when any two of its sentences have a conjunction, and so on.

c. Quantifiers

The institution-theoretic treatment of quantifiers is semantic and relies crucially upon the concept of signature morphism, essentially by assimilating valuations of variables to model expansions along the extension of the signature with the respective variables. While in concrete institutions one may talk about valuations of variables X into a model M as functions from X to some underlying set-theoretic carrier of M, in the abstract setup this is not possible due to the lack of explicit set-theoretic structure. However, by assimilating the variables X to the signature extensions \Sigma \rightarrow \Sigma+X obtained by adding them to the respective signatures, we may note that for any \Sigma-model M there is a canonical one-to-one correspondence between the valuations X \rightarrow M and the \Sigma+X-models M' whose reducts to \Sigma are just M. This trick implies that variables become part of the signatures, which breaks with the habit of traditional approaches to logic of keeping variables separate from the language (signature). The point of this separation is to avoid clashes between X and the entities of \Sigma. However, this can be achieved differently, without separating the variables from the signatures, which anyway poses several difficulties from the formal perspective. The idea is very simple and comes from the practice of specification languages: one just has to qualify the variables properly by their signature context. For example, a variable for a signature \Sigma in the institution of many-sorted first-order logic would be a triple (x,s,\Sigma) where x is the name, s the sort/type, and \Sigma the signature of the variable. The signature \Sigma becoming part of the data defining the variables X prevents, for formal set-theoretic reasons, any clash between \Sigma and X when adjoining X to \Sigma.

Hence at the level of abstract institutions a variable for a signature \Sigma is just a signature morphism \chi \,\colon\; \Sigma \rightarrow \Sigma'. Note that in concrete situations variables-as-signature-morphisms support concepts of variables at the same level as the signatures, hence the notion is rather powerful. For example, if the signature allows for higher-order functions, then one can have variables for those. However, often the intended variables are significantly more particular than what the respective concept of signature provides. This is one of the reasons why not any signature morphism can be considered as representing a variable. Another reason is that in concrete institutions the signature morphisms representing variables are extensions, and moreover they are extensions with entities that have a proper qualification, as discussed above. A standard way to realize these restrictions abstractly is to designate an abstract subclass \mathcal{D} of signature morphisms as variables. Given (\chi \,\colon\; \Sigma \rightarrow \Sigma') in \mathcal{D}, a \Sigma-sentence \rho is a universal \chi-quantification of a \Sigma'-sentence \rho' when for each \Sigma-model M, M \models_\Sigma \rho if and only if M' \models_{\Sigma'} \rho' for all \Sigma'-models M' such that M = \mathrm{Mod}(\chi)(M'). Existential quantification is defined similarly. The institution is said to have universal/existential \mathcal{D}-quantifications when any sentence \rho' as above has a universal/existential \chi-quantification for any \chi\in \mathcal{D} as above.
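
Although \mathit{PL} is usually presented without quantifiers, the expansion-based semantics of universal \chi-quantification can be illustrated on it by taking \chi to be a signature inclusion P \rightarrow P + X; the Python sketch below (with our own representations) checks a sentence against all expansions of a model:

    from itertools import product

    def satisfies(model, s):
        return (model[s[1]] == 1 if s[0] == "var"
                else not satisfies(model, s[1]) if s[0] == "not"
                else satisfies(model, s[1]) and satisfies(model, s[2]))

    def expansions(M, X):
        """All expansions of the model M along the inclusion P -> P + X,
        i.e. all ways of additionally valuating the new symbols X."""
        X = sorted(X)
        return [{**M, **dict(zip(X, bits))} for bits in product([0, 1], repeat=len(X))]

    def satisfies_forall(M, X, rho_prime):
        """M |= (forall chi) rho'  iff  every expansion of M along chi satisfies rho'."""
        return all(satisfies(M_exp, rho_prime) for M_exp in expansions(M, X))

    # Over P = {p} with a new symbol x: (forall x) not (x and not x) holds in every
    # P-model, while (forall x) x does not.
    taut = ("not", ("and", ("var", "x"), ("not", ("var", "x"))))
    assert satisfies_forall({"p": 1}, {"x"}, taut)
    assert not satisfies_forall({"p": 1}, {"x"}, ("var", "x"))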

In logic in general and in model theory in particular it is well known that quantification by first-order variables has very good and desirable properties. This kind of quantification is captured at the abstract level by requiring that any signature morphism (\chi \,\colon\; \Sigma \rightarrow \Sigma') in \mathcal{D} is representable, which essentially means that there exists a \Sigma-model M_\chi such that, for each \Sigma-model N, there is a one-to-one correspondence between the \Sigma'-models N' with \mathrm{Mod}(\chi)(N') = N and the \Sigma-model morphisms M_\chi \rightarrow N. The idea behind this is that in concrete situations valuations of variables can be represented by model morphisms from a ‘free model’ over the variables. In other words, when \chi is a signature extension \Sigma \rightarrow \Sigma + X as discussed above, M_\chi is the \Sigma-model freely generated by X, which in standard actual situations is a term model over X.

d. Diagrams

The method of diagrams is a widely pervasive method in model theory (see for example (Chang and Keisler, 1990)). Its institution-theoretic abstraction by Diaconescu (2004b) also plays an important role in many institution-independent model theory developments. In essence, the diagram of a given model M represents a kind of comprehensive syntactic characterisation of M. The institution-theoretic definition introduced in (Diaconescu, 2004b) axiomatises a category-theoretic one-to-one correspondence between the model morphisms M \rightarrow N and the models satisfying an abstractly designated set of sentences E_M. The signature \Sigma_M of E_M is not the signature \Sigma of the model M; instead there is an abstractly designated signature morphism

    \[\iota_{\Sigma,M} \,\colon\; \Sigma \rightarrow \Sigma_M\]

In the concrete examples in which the models have an underlying set theoretic structure, \iota_{\Sigma,M} is usually the extension of \Sigma with the elements of the underlying carrier of M that are adjoined to \Sigma as new constants.

The most typical examples come from classical first-order model theory. Let \iota_{\Sigma,M} be as described above, and let M_M be the \Sigma_M-model that expands M by interpreting the new constants by themselves. Then E_M is the set of all atoms that are satisfied by M_M. This corresponds to the case when the model morphisms are those that preserve the structure of the models; if we change the concept of model morphism then the diagram should change too. For example, if we consider the elementary embeddings as model morphisms, then E_M is much larger, consisting of all first-order sentences that are satisfied by M_M. And so on. The narrower the class of model morphisms considered, the larger the corresponding diagrams.

The existence of institution-theoretic diagrams in concrete logical systems is a mark of coherence between the syntax (the kind of sentences involved) and the semantics (the concept of model morphism employed). For example, non-hybrid modal logics lack diagrams because there is an imbalance between the syntax and the semantics (Kripke structures), in the sense that the syntax is too weak to express aspects of the semantics. On the other hand, hybrid (modal) logics do have diagrams because they have more expressive power with respect to the Kripke frames.

e. Ultraproducts

The method of ultraproducts is a most important and remarkably powerful one in conventional model theory (Chang and Keisler, 1990). It has been realised at the abstract level of institution theory beginning with (Diaconescu, 2003), on the basis of the previously established concept of categorical ultraproduct (introduced perhaps for the first time in (Matthiessen, 1978)) applied to the categories of models \mathrm{Mod}(\Sigma) in institutions. This requires some familiarity with categorical limits and colimits. A filter F on a set I is a set of subsets of I such that J \cap J' \in F when J,J' \in F, and J'\in F when J\in F and J \subseteq J'. F is an ultrafilter when in addition F satisfies the property that for each J\subseteq I, J \in F if and only if its complement I \setminus J \not\in F. Let us consider a family (M_i)_{i\in I} of \Sigma-models in an institution and a filter F on I. For any J\in F let us denote the direct product of (M_i)_{i\in J} by M_J. If \mathrm{Mod}(\Sigma) has direct products then for any J \subseteq J' in F there is a canonical projection

    \[p_{J'\supseteq J} \,\colon\; M_{J'} \rightarrow M_J\]

Then any colimit

    \[\mu = \{ \mu_J \,\colon\; M_J \rightarrow M_F \mid J \in F \}\]

of the diagram

    \[\{ p_{J'\supseteq J} \mid J\subseteq J', J\in F \}\]

is called an F-product of (M_i)_{i\in I}. When F is an ultrafilter, F-products are called ultraproducts. Due to the uniqueness up to isomorphism of categorical direct products and of colimits, it follows that F-products are unique only up to isomorphism too.
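
The filter and ultrafilter conditions above can be checked mechanically on a finite index set, as in the following Python sketch; on a finite set every ultrafilter is principal, so this only illustrates the definitions, not the ultraproduct construction itself.

    from itertools import chain, combinations

    def powerset(I):
        I = list(I)
        return [frozenset(c) for c in
                chain.from_iterable(combinations(I, r) for r in range(len(I) + 1))]

    def is_filter(F, I):
        """F is closed under intersections and under supersets within I."""
        closed_inter = all(J & K in F for J in F for K in F)
        closed_up = all(K in F for J in F for K in powerset(I) if J <= K)
        return closed_inter and closed_up

    def is_ultrafilter(F, I):
        """An ultrafilter contains exactly one of J and its complement, for each J."""
        I = frozenset(I)
        return is_filter(F, I) and all((J in F) != ((I - J) in F) for J in powerset(I))

    I = frozenset({0, 1, 2})
    principal_at_0 = {J for J in powerset(I) if 0 in J}   # the principal ultrafilter at 0
    assert is_ultrafilter(principal_at_0, I)

    trivial = {frozenset(I)}                              # the filter {I} is not an ultrafilter
    assert is_filter(trivial, I) and not is_ultrafilter(trivial, I)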

The foundation of the ultraproducts method in first-order model theory is a result in (Łoś, 1955) which gives a ‘preservation’ property for the satisfaction by ultraproducts of models that is common to all sentences of first-order logic. This has been highly generalised in institution theory by decomposing it into a puzzle of general preservation results across connectors (Boolean, modalities) and quantifiers. Concrete instances of this result and of some of its extensions provide for free an ultraproducts method for a variety of logical systems, including unconventional ones for which such a development was otherwise difficult to envisage.

A typical application of ultraproducts is the derivation of the compactness of the semantic consequence without having to resort to a proof system and a related completeness argument, which in general is technically very difficult. An institution is said to be compact when for each semantic consequence E \models_\Sigma \rho, where E is a set of sentences and \rho is a single sentence, there exists a finite set E_0 \subseteq E such that E_0 \models_\Sigma \rho. In the presence of the preservation of the satisfaction relation under the ultraproduct construction of models, if in addition the institution has conjunctions and negations, we get the compactness of the institution. When adjoined to the institution-theoretic generalisation of the fundamental ultraproducts result of (Łoś, 1955), this remarkably general property gives an abstract compactness result, which can be instantiated with little effort to a wide variety of concrete logical systems. The efficiency of this path to compactness has become transparent, for example, in (Diaconescu and Stefaneas, 2007) in the case of a wide class of quantified modal systems.

f. Interpolation and Definability

Interpolation is an important property of logical systems which has a strikingly elementary formulation but which is usually very difficult to establish. Its common semantic version sounds like this:

given a \Sigma_1-sentence \rho_1 and a \Sigma_2-sentence \rho_2, if \rho_1 \models_{\Sigma_1 \cup \Sigma_2} \rho_2 then there exists a \Sigma_1 \cap \Sigma_2-sentence \rho_0 (called interpolant) such that \rho_1 \models_{\Sigma_1} \rho_0 and \rho_0 \models_{\Sigma_2} \rho_2.

In other words, any semantic consequence is established by means of the symbols that are common to both sentences. In institution theory interpolation is considered in a form that generalizes several aspects of its common formulation. First, the signature inclusions that appear implicitly in the formulation of interpolation are abstracted to arbitrary signature morphisms. The common formulation of interpolation then corresponds to the situation when the institution comes with its signature morphisms restricted to inclusions only. However, the generalisation of interpolation to arbitrary signature morphisms allows in concrete situations for the consideration of signature morphisms that may rename or even collapse syntactic entities. While such an extended form of interpolation may be unusual in conventional logic, it is used in specification theory. A second generalisation of the concept of interpolation replaces individual sentences by finite sets of sentences. While in logics that have conjunction (such as classical propositional and first-order logics, and so forth) this does not change anything, it is very meaningful in logics lacking conjunction, such as equational or Horn clause logics. In the latter, interpolation may fail artificially due to the unrealistic single-sentence style of formulation. This gets us to the following definition of interpolation. A commuting square of signature morphisms, as below,

    \[\varphi_1 \,\colon\; \Sigma_0 \rightarrow \Sigma_1, \qquad \varphi_2 \,\colon\; \Sigma_0 \rightarrow \Sigma_2, \qquad \theta_1 \,\colon\; \Sigma_1 \rightarrow \Sigma', \qquad \theta_2 \,\colon\; \Sigma_2 \rightarrow \Sigma', \qquad \varphi_1;\theta_1 = \varphi_2;\theta_2\]

is an interpolation square when for any finite sets E_1\subseteq\mathrm{Sen}(\Sigma_1) and E_2\subseteq\mathrm{Sen}(\Sigma_2) such that

    \[\mathrm{Sen}(\theta_1)(E_1) \models_{\Sigma'} \mathrm{Sen}(\theta_2)(E_2)\]

there exists a finite set

    \[E_0 \subseteq\mathrm{Sen}(\Sigma_0)\]

such that

    \[E_1 \models_{\Sigma_1} \mathrm{Sen}(\varphi_1)(E_0)\]

and

    \[\mathrm{Sen}(\varphi_2)(E_0)\models_{\Sigma_2} E_2\]

Such commuting squares are meant to emulate the intersection-union of signatures from the common formulation of interpolation, with \Sigma_0 in the role of \Sigma_1 \cap \Sigma_2 and \Sigma' in the role of \Sigma_1 \cup \Sigma_2. However in the common formulation of interpolation it is important that \Sigma_1 \cup \Sigma_2 is the lowest signature above \Sigma_1 and \Sigma_2. In the generalised formulation of interpolation this property appears as a category theoretic condition, that the commuting square is a pushout in the category \mathrm{Sig} of the signatures of the institution. But when considering interpolation as a property of the institution as a whole, it is in general not meaningful to look for interpolation in all pushout squares. This leads to another generalisation layer in the formulation of interpolation, which restricts abstractly the range of \varphi_1 and \varphi_2 to designated subclasses of signature morphisms. Thus, given \mathcal{L} and \mathcal{R} subclasses of signature morphisms, the institution has (\mathcal{L},\mathcal{R})-interpolation when each pushout of a span (\varphi_1,\varphi_2) of signature morphisms with \varphi_1\in \mathcal{L} and \varphi_2\in \mathcal{R} is an interpolation square.
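
In \mathit{PL} the classical intersection–union instance of this definition can be checked by brute force: the \Sigma_1 \cap \Sigma_2-reducts of the models of \rho_1 determine a semantic interpolant. The Python sketch below (reusing our toy representations, on an example of our own choosing) verifies the two defining consequences for that candidate; it is a finite illustration only, not the general institution-theoretic result.

    from itertools import product

    def satisfies(model, s):
        return (model[s[1]] == 1 if s[0] == "var"
                else not satisfies(model, s[1]) if s[0] == "not"
                else satisfies(model, s[1]) and satisfies(model, s[2]))

    def all_models(signature):
        sig = sorted(signature)
        return [dict(zip(sig, bits)) for bits in product([0, 1], repeat=len(sig))]

    def restrict(model, signature):
        return tuple(sorted((p, model[p]) for p in signature))

    # Sigma_1 = {p, q}, Sigma_2 = {q, r}, hence Sigma_0 = {q}.
    sigma1, sigma2 = {"p", "q"}, {"q", "r"}
    sigma0 = sigma1 & sigma2

    rho1 = ("and", ("var", "p"), ("var", "q"))                    # p and q
    rho2 = ("not", ("and", ("not", ("var", "q")), ("var", "r")))  # r implies q

    # rho1 |= rho2 over the union signature.
    assert all(satisfies(M, rho2)
               for M in all_models(sigma1 | sigma2) if satisfies(M, rho1))

    # Semantic interpolant: the Sigma_0-reducts of the Sigma_1-models of rho1.
    interpolant = {restrict(M, sigma0) for M in all_models(sigma1) if satisfies(M, rho1)}

    # rho1 |= rho0 over Sigma_1, and rho0 |= rho2 over Sigma_2.
    assert all(restrict(M, sigma0) in interpolant
               for M in all_models(sigma1) if satisfies(M, rho1))
    assert all(satisfies(M, rho2)
               for M in all_models(sigma2) if restrict(M, sigma0) in interpolant)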

Institution-theoretic interpolation has been established at the general level in relation to several different causes. One cause can be an axiomatizability property of the institution, as in (Diaconescu, 2004a). A typical example here is many-sorted equational logic, which has (\mathcal{L},\mathcal{R})-interpolation with \mathcal{L} being the injective signature morphisms and \mathcal{R} being all signature morphisms. Another cause can be the Robinson consistency property, as in (Găină and Popescu, 2007). An instance of this is many-sorted first-order logic, which has (\mathcal{L},\mathcal{R})-interpolation when either \mathcal{L} or \mathcal{R} consists of signature morphisms that are injective on the sorts. And yet another cause can be the existence of an adequate translation to an institution that has well-established interpolation properties, as in (Diaconescu, 2012b).

Definability is one of the traditional important consequences of interpolation. The concept of definability has been approached in institution theory by Petria and Diaconescu (2006) in a rather similar way to interpolation, by abstracting the new (presumably definable) symbols for signatures to arbitrary abstract signature morphisms. The institution-theoretic study of definability in (Petria and Diaconescu, 2006) revealed that, besides interpolation, axiomatizability properties (considered in the generalised form introduced in (Diaconescu, 2004a)) may constitute a primary cause for definability.

g. Layered Completeness

The concept of institution can be enhanced with a proof theoretic structure (Meseguer, 1989) by adding for each signature \Sigma an entailment relation \vdash_\Sigma between sets of \Sigma-sentences and single \Sigma-sentences subject to the following axioms:

  • E \vdash_\Sigma \rho when \rho\in E;
  • If E \vdash_\Sigma \gamma for each \gamma\in \Gamma and \Gamma \vdash_\Sigma \rho then E \vdash_\Sigma \rho; and
  • For each signature morphism \varphi \,\colon\; \Sigma \rightarrow \Sigma' if E \vdash_\Sigma \rho then \mathrm{Sen}(\varphi)(E) \vdash_{\Sigma'} \mathrm{Sen}(\varphi)(\rho).

The entailment system \vdash is sound for the respective institution when E \vdash_\Sigma \rho implies E \models_\Sigma \rho and is complete when the reverse implication holds. These are institution-theoretic generalisations of the common concepts of soundness and completeness from logic and model theory. Soundness is an obligatory property that is also easy to establish in the concrete situations. Completeness is very desirable but hard to establish; completeness results are difficult ones.

The institution-theoretic approach to completeness results is a layered one, based upon the observation that proof systems can usually be deconstructed into several layers that correspond to the structure of the sentences involved, and that completeness can be developed at each layer relative to the completeness at the previous one. This means that at each layer the respective entailment system is considered fully abstract, and hence the proof rules that build the next layer come in a form that is independent of the previous layers. The total completeness result is thus obtained as a combination of smaller independent completeness results, each of them having the potential to be reused in other contexts.

For instance, let us consider the case of the completeness of Horn clause logic with equality, the fragment of first order logic with equality that restricts the sentences to those of the form (\forall X)H \Rightarrow C, where H is a finite conjunction of (equational and relational) atoms and C is a single atom. This completeness can be decomposed into three layers. At the base layer we consider the institution that has as sentences only equational t=t' and relational \pi(t_1,\dots,t_n) atoms. For that we have a complete proof system defined by

  • \vdash t=t for each term t;
  • t =t' \vdash t' = t for all terms t,t';
  • \{ t=t', t'=t'' \} \vdash t = t'' for all terms t,t',t'';
  • \{ t_i = t'_i \mid 1\leq i\leq n \} \vdash \sigma(t_1,\dots,t_n) = \sigma(t'_1,\dots, t'_n) for any function symbol \sigma of arity n; and
  • \{ t_i = t'_i \mid 1\leq i\leq n \} \cup \{ \pi(t_1,\dots,t_n) \} \vdash \pi(t'_1,\dots,t'_n) for any relation symbol \pi of arity n.

At the next layer we add the sentences of the form H \Rightarrow C, where H is a finite conjunction of atoms and C is a single atom, and extend the proof system with the meta-rule

  • \Gamma \cup H \vdash C if and only if \Gamma \vdash H \Rightarrow C.

The crucial point here is that this step does not depend upon the previous layer, meaning that information can be completely encapsulated, allowing the addition of implications over an arbitrary institution endowed with a complete entailment system. Then the completeness at the new layer is obtained. The final layer consists of adding the universal quantification to the sentences and a rule and a meta-rule to the proof system.

  • (\forall X)\rho \vdash (\forall Y)\theta(\rho) for any substitution \theta of variables X with terms over the variables Y; and
  • \Gamma \vdash _\Sigma (\forall X)\rho if and only if \Gamma \vdash_{\Sigma+X} \rho.

This step can also be developed independently of the previous layer, over an institution endowed with a complete entailment system considered fully abstractly. This, however, requires an abstract treatment of substitutions. The completeness at the final, upper layer is obtained as well. By instantiating now the base layer to that of the atoms of first-order logic with equality and the final layer to universal first-order quantification, we get a complete proof system for Horn clause logic with equality. But all the component completeness results can also be used separately, on different instances of the abstract institutions, thus obtaining complete proof systems in various different logical systems. For example, we can obtain a complete proof system for the universal fragment of first-order logic, that is, for sentences of the form (\forall X)\rho where \rho is a quantifier-free sentence, just by replacing the mid layer above by the proof system of propositional logic.
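
The modularity of the layers can be mimicked in code: the Python sketch below builds an entailment procedure for Horn-style sentences (H, C) on top of an arbitrary base entailment supplied as a parameter, in the spirit of the meta-rule above. Forward chaining is our own illustrative choice of procedure here, not the construction used in the cited completeness proofs.

    def horn_entails(base_entails, atoms, clauses, goal):
        """Decide whether atoms + clauses entail `goal` at the Horn layer,
        parameterised by a base entailment base_entails(set_of_atoms, atom) -> bool
        supplied by the previous layer. Forward chaining to a fixpoint."""
        derived = set(atoms)
        changed = True
        while changed:
            changed = False
            for premises, conclusion in clauses:
                if conclusion not in derived and all(base_entails(derived, h) for h in premises):
                    derived.add(conclusion)
                    changed = True
        return base_entails(derived, goal)

    # A trivial base layer: an atom follows only if it is literally present.
    base = lambda gamma, atom: atom in gamma

    clauses = [(frozenset({"a"}), "b"), (frozenset({"b"}), "c")]
    assert horn_entails(base, {"a"}, clauses, "c")
    assert not horn_entails(base, set(), clauses, "c")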

h. Notes

The institution-theoretic concept of diagrams introduced by Diaconescu (2004b) is significantly simpler than a previously introduced one in (Tarlecki, 1986a,c). Since (Diaconescu, 2004b) this has been used rather intensively in many institution theory works.

The many-sorted first-order logic instance of the general interpolation result of (Găină and Popescu, 2007) represents an elegant solution, by means of institution theory, to a question about the limits of interpolation in many-sorted first-order logic that had remained a conjecture for several years. Layered completeness was introduced by Borzyszkowski (2002) within the context of the study of complete calculi for structured specifications. The example discussed here comes from Codescu and Găină (2008). Other works on layered completeness include (Găină and Petria, 2010) (two layers) and (Găină et al., 2012) (four layers).

Other important model theory methods that have been developed at the level of abstract institutions include saturated models (Diaconescu and Petria, 2010), forcing (Găină and Petria, 2010), and omitting types (Găină, 2014). These have led to remarkably general versions of deep model theory results, including the downward Löwenheim-Skolem theorem in (Găină, 2014) and the Keisler-Shelah characterisation of first-order elementary equivalence as isomorphism under ultrapowers in (Diaconescu and Petria, 2010).

4. Logic by Translation

Institution theory is very well positioned with respect to the logic-by-translation paradigm because of its perspective on logical systems as mathematical objects/structures. Concepts of structure-preserving mappings between institutions, regarded as mathematical structures, constitute mathematical formalisations of the notion of translation between logics. The importance of this idea has been recognised right from the beginnings of institution theory.

a. Morphisms and Comorphisms

Institutions as mathematical structures invite several concepts of ‘morphism’, or structure-preserving mapping between institutions. All of them consist of three components, corresponding to the translations of the signatures, of the sentences, and of the models. Moreover, in each case an axiom is imposed that expresses an invariance property of the satisfaction relation with respect to these translations.

The original structure-preserving mapping between institutions was defined by Goguen and Burstall (1992) and called simply a morphism of institutions. Given institutions \mathcal{I} and \mathcal{I}', a morphism \mathcal{I}' \rightarrow \mathcal{I} consists of

  • a functor \Phi \,\colon\; \mathrm{Sig}' \rightarrow \mathrm{Sig} translating \mathcal{I}'-signatures to \mathcal{I}-signatures;
  • for each \mathcal{I}'-signature \Sigma' a sentence translation \alpha_{\Sigma'} \,\colon\; \mathrm{Sen}(\Phi(\Sigma')) \rightarrow \mathrm{Sen}'(\Sigma')
  • for each \mathcal{I}'-signature \Sigma' a model translation/reduct \beta_{\Sigma'} \,\colon\; \mathrm{Mod}'(\Sigma') \rightarrow \mathrm{Mod}(\Phi(\Sigma'))

such that both \alpha and \beta are natural transformations (a category theory notion; ‘naturality’ in this context means a coherence property of the component translations with respect to the signature morphisms) and such that the following Satisfaction/Translation Condition holds for each \Sigma'-model M' and each \Phi(\Sigma')-sentence \rho:

    \[ M' \models'_{\Sigma'} \alpha_{\Sigma'}(\rho) \mbox{ if and only if } \beta_{\Sigma'}(M') \models_{\Phi(\Sigma')} \rho.\]

Institution morphisms have the flavour of ‘projecting’ from a more complex institution to a simpler one, like the following morphism from first-order to propositional logic. It maps any first-order signature \Sigma' to its set of sentences, each sentence being regarded as a propositional variable of a propositional logic signature; \alpha_{\Sigma'}(\rho) is just the first-order sentence \rho itself; and \beta_{\Sigma'}(M') is the propositional logic model consisting of all \Sigma'-sentences that hold in M'.

By reversing the translation of the signatures we get the concept of comorphism of institutions, which has the flavour of an ‘embedding’ of a simpler institution into a more complex one. A comorphism \mathcal{I} \rightarrow \mathcal{I}' consists of

  • a functor \Phi \,\colon\; \mathrm{Sig} \rightarrow \mathrm{Sig}' translating \mathcal{I}-signatures to \mathcal{I}'-signatures;
  • for each \mathcal{I}-signature \Sigma a sentence translation \alpha_{\Sigma} \,\colon\; \mathrm{Sen}(\Sigma) \rightarrow \mathrm{Sen}'(\Phi(\Sigma)); and
  • for each \mathcal{I}-signature \Sigma a model translation/reduct \beta_{\Sigma} \,\colon\; \mathrm{Mod}'(\Phi(\Sigma)) \rightarrow \mathrm{Mod}(\Sigma).

such that both \alpha and \beta are natural transformations and the following Satisfaction/Translation Condition holds for each \Phi(\Sigma)-model M' and each \Sigma-sentence \rho:

    \[ M' \models'_{\Phi(\Sigma)} \alpha_{\Sigma}(\rho) \mbox{ if and only if } \beta_{\Sigma}(M') \models_{\Sigma} \rho.\]

The embedding of propositional logic into first-order logic can be captured as a comorphism that interprets any set of propositional variables as a first-order signature that has only relation symbols of arity zero; note that both \alpha and \beta are identities.
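
To make the shape of these definitions concrete, the following is a minimal Haskell sketch of institutions and comorphisms, with all type and field names chosen purely for illustration. The categorical structure is deliberately collapsed: signatures, sentences, and models appear as plain type parameters, signature morphisms and the functoriality and naturality requirements are omitted, and only the Satisfaction/Translation Condition is expressed, as a property that can be tested for particular signatures, models, and sentences.

    -- A minimal, non-categorical sketch (illustrative names): an institution is
    -- reduced to its family of satisfaction relations, indexed by signatures.
    data Institution sig sen model = Institution
      { satisfies :: sig -> model -> sen -> Bool }   -- M |=_Sigma rho

    -- A comorphism I -> I': a signature translation phi, sentence translations
    -- alpha, and model reducts beta (note that beta runs in the opposite
    -- direction to alpha; for a morphism, phi would be reversed instead).
    data Comorphism sig sig' sen sen' model model' = Comorphism
      { phi   :: sig -> sig'
      , alpha :: sig -> sen -> sen'
      , beta  :: sig -> model' -> model
      }

    -- The Satisfaction/Translation Condition for a given signature, model, and
    -- sentence: M' |=' alpha(rho) if and only if beta(M') |= rho.
    satisfactionCondition
      :: Institution sig sen model -> Institution sig' sen' model'
      -> Comorphism sig sig' sen sen' model model'
      -> sig -> model' -> sen -> Bool
    satisfactionCondition i i' c sigma m' rho =
      satisfies i' (phi c sigma) m' (alpha c sigma rho)
        == satisfies i sigma (beta c sigma m') rho

On this reading, checking that a candidate translation is indeed a comorphism amounts to checking that satisfactionCondition returns True for all signatures, models, and sentences (together with the omitted naturality requirements).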

Morphisms and comorphisms preserve institution-theoretic structure to the same degree, so from the viewpoint of institutions as mathematical structures alone one cannot say which of the two is more adequate to play the role of morphisms for the category of institutions. This means that there are at least two categories that have institutions as their objects, both of them equally legitimate from the perspective of structure. This constitutes a good example of the idea, prevalent in category theory, that a category is best described by its morphisms rather than by its objects. The conceptual symmetry between institution morphisms and comorphisms was given a formal explanation by Arrais and Fiadeiro (1996), who showed that a categorical adjunction between the categories of signatures of two institutions \mathcal{I} and \mathcal{I}' determines a canonical one-to-one correspondence between the morphisms \mathcal{I}' \rightarrow \mathcal{I} and the comorphisms \mathcal{I} \rightarrow \mathcal{I}'.

b. Encodings

Besides embeddings of simpler logics into more complex ones, the concept of comorphism also supports ‘encodings’ of more complex logics into simpler ones. Famous cases such as the translation of classical propositional logic into intuitionistic logic by Kolmogorov (1925) or the standard translation of modal logic into first-order logic by van Benthem (1988) can be presented as comorphisms. Most often the cost representing the difference in complexity is paid at the level of the translation of the signatures, with \Phi mapping signatures to theories in the target institution. This can be explained as a comorphism \mathcal{I} \rightarrow \mathcal{I}'^{\mathrm{th}}, where \mathcal{I}'^{\mathrm{th}} denotes the canonically defined institution of \mathcal{I}'-theories, which has as signatures the pairs (\Sigma,E) where \Sigma is an \mathcal{I}'-signature and E is a set of \Sigma-sentences. (Note that ‘theories’ here are mere sets of sentences rather than sets of sentences closed under some consequence relation, as meant in some logical studies.) Comorphisms \mathcal{I} \rightarrow \mathcal{I}'^{\mathrm{th}} are sometimes called theoroidal comorphisms. The two encodings mentioned above represent a notable exception, as they can be presented as plain rather than theoroidal comorphisms (Goguen and Roşu, 2002), in both cases the encoding cost being paid at the level of the translation of the sentences.
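
For instance, the sentence-level cost in the standard translation of modal logic can be seen in its well-known clause for the modal operator, which trades a modality for a first-order quantification over the accessibility relation R (recalled here only as an illustration):

    \[ ST_x(\Box \varphi) \;=\; (\forall y)\, (R(x,y) \Rightarrow ST_y(\varphi)), \]

with propositional variables p translated as unary predicates P(x) and the Boolean connectives translated homomorphically.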

c. Borrowing

Comorphism-based encodings between institutions represent the main tool of the institution-theoretic approach to the logic-by-translation paradigm. Suppose we want to establish a property P in an institution \mathcal{I} but, for various reasons, this is rather difficult. Then we can look for a suitable encoding \mathcal{I} \rightarrow \mathcal{I}' such that the translation of P along the encoding can be established in \mathcal{I}', and moreover such that this conclusion can be reflected back to \mathcal{I} in the form of P. In this case we say that P is ‘borrowed’ from \mathcal{I}' along the encoding. This requires the encoding to satisfy some specific properties conducive to the borrowing process. The institution theory literature abounds with such examples, including interpolation, definability, diagrams, saturated models, and so forth.

One especially important case is that of the borrowing of a consequence relation. Let P be a consequence E \models_\Sigma \rho to be established. If it holds then by a simple argument relying upon the Satisfaction Condition of the comorphism we have that

    \[\alpha_\Sigma (E) \models'_{\Phi(\Sigma)} \alpha_\Sigma (\rho)\]

holds too. If \mathcal{I}' comes equipped with a complete proof system then

    \[\alpha_\Sigma (E) \models'_{\Phi(\Sigma)} \alpha_\Sigma (\rho)\]

may be established by proof-theoretic means. Now, in order to get the conclusion back to \mathcal{I}, we need a reflection property for the semantic consequence, which does not hold in general. However, a standard way to ensure it is to check that the comorphism is conservative, which means that for any \Sigma-model M there exists a \Phi(\Sigma)-model M' such that \beta_\Sigma (M') = M.
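
The reason conservativity suffices can be spelled out in a short chain of steps (a routine argument, included here for completeness). Suppose \alpha_\Sigma(E) \models'_{\Phi(\Sigma)} \alpha_\Sigma(\rho) has been established, and let M be any \Sigma-model with M \models_\Sigma E. By conservativity there is a \Phi(\Sigma)-model M' with \beta_\Sigma(M') = M, and then

    \[ M' \models'_{\Phi(\Sigma)} \alpha_\Sigma(E), \quad \mbox{hence} \quad M' \models'_{\Phi(\Sigma)} \alpha_\Sigma(\rho), \quad \mbox{hence} \quad M \models_\Sigma \rho, \]

where the first and last steps use the Satisfaction Condition of the comorphism and the middle step uses the established \mathcal{I}'-consequence. Since M was arbitrary, E \models_\Sigma \rho.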

d. Notes

The importance of comorphisms and other types of structure-preserving mappings between institutions has been understood gradually, with the paper (Goguen and Roşu, 2002) presenting a systematic comparative overview of the different notions of morphism between institutions. That paper also fixed much of the current terminology, including, for example, the term ‘comorphism’.

The work (Mossakowski et al., 2005) gave the institution-theoretic answer to the contest question of the 1st World Congress on Universal Logic (Montreux, Switzerland, 2005), namely: what is the identity of a logic? The answer was that it is an equivalence class of institutions under equivalence of institutions. Briefly, the concept of equivalence of institutions proposed by Mossakowski et al. (2005) is a comorphism (or, alternatively, a morphism) that consists of categorical equivalences at all levels.

The borrowing of a consequence relation is the foundation for formal verification by translation within the context of logic-based formal system specification. A specification based upon a source institution \mathcal{I} may have very good expressivity and readability, but \mathcal{I} may lack adequate proof support. The solution is to shift the formal verification process across an encoding to an institution \mathcal{I}' that is supported by good tools, such as theorem provers, proof assistants, and so forth. Within the context of the heterogeneous specification paradigm this has become rather standard practice (Mossakowski, 2005), and it is very economical since it makes good use of existing tools instead of building new ones.

5. Contributions to Computing Science

The contributions of institution theory to computing science are manifold. The most basic one is that it sets a standard style for developing new specification languages: one begins by defining a logical system captured as an institution, and then develops all the language constructs so that they are solidly and rigorously backed by corresponding mathematical entities in the underlying institution.

a. Structured Specifications

Software systems tend to be very complex, and likewise their formal specifications. One key device for managing this complexity is a structuring mechanism for specifications that allows them to be developed in a modular way. It has been noticed that structuring systems are largely, if not completely, independent of the logical systems underlying specification formalisms. Hence the idea of studying the structuring of specifications and programs at a conceptual level that abstracts away the details of logical systems and instead exploits some of their compositionality properties. This was the original motivation for institution theory.

Given an institution, considered abstractly, a structured specification is just a term formed by applying a specific fixed set of building operators to blocks consisting of finite sets of sentences in the institution (called basic, flat, or unstructured specifications). Then, by induction on the structure of a specification \mathit{SP}, one may calculate its signature \mathrm{Sig}(\mathit{SP}) and its class of models \mathrm{Mod}(\mathit{SP}). Seminal works such as (Sannella and Tarlecki, 1988) have proposed a core set of specification-building operators consisting of renaming (translation), sum, and hiding (derivation), which can express most of the concrete structuring operations of actual specification languages. Often an initial-semantics operator is added in order to deal with specification modules having initial semantics. Many results have been developed on this conceptual basis, providing a uniform and solid foundation for modular software development. A particularly remarkable one is the layered completeness result of Borzyszkowski (2002), which lifts a proof system (considered in the form of an abstract entailment system) from the base institution to structured specifications; under a set of general conditions expressed as properties of the base institution, the completeness of the former yields the completeness of the latter.
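
The following Haskell fragment sketches what such specification terms look like over an abstract institution, using the core building operators just mentioned; all names are illustrative, and only the signature of a specification is computed, since the model-class semantics would additionally require model reducts and expansions along signature morphisms.

    -- A minimal sketch (illustrative names) of structured specification terms
    -- over an abstract institution, with basic specifications and the core
    -- operators of renaming (translation), sum, and hiding (derivation).
    data Spec sig mor sen
      = Basic sig [sen]                            -- a signature plus finitely many axioms
      | Translate mor (Spec sig mor sen)           -- rename along a signature morphism
      | Sum (Spec sig mor sen) (Spec sig mor sen)  -- union of two specifications over the same signature
      | Hide mor (Spec sig mor sen)                -- hide symbols along a signature morphism

    -- Sig(SP), computed by induction on the structure of SP; 'source' and
    -- 'target' extract the two ends of a signature morphism.
    sigOf :: (mor -> sig) -> (mor -> sig) -> Spec sig mor sen -> sig
    sigOf source target sp = case sp of
      Basic sigma _  -> sigma
      Translate m _  -> target m
      Sum sp1 _      -> sigOf source target sp1    -- both arguments are assumed to share a signature
      Hide m _       -> source m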

A later development by Diaconescu (2012a) proposes an even more abstract approach that avoids particular choices of specification-building operators. It abstracts (structured) specifications as signatures in an abstract ‘upper-level’ institution \mathcal{I}', the relationship to the base institution \mathcal{I} (representing the underlying logic) being axiomatised as a special kind of institution morphism whose sentence translations are identities and whose model translations/reducts are inclusions.

b. Heterogeneous Specifications

The increasing complexity of current systems has led to an understanding of the limitations of specification formalisms based upon single logical systems. Hence the emergence of heterogeneous specification environments based on multiple logical systems. A standard view of such heterogeneous logical environments, actually realised by CafeOBJ (Diaconescu and Futatsugi, 1998) and Hets (Mossakowski, 2005), is as diagrams of institutions linked by comorphisms. However, this raises the difficult question of how to lift specification theory to the heterogeneous level. The answer comes from a category-theoretic construction that originated in algebraic geometry with Grothendieck (1963) and that can be replicated for institutions in order to ‘flatten’ a diagram of institutions into a single institution that retains all the data provided by the respective diagram. This construction is known as the Grothendieck institution associated to the respective diagram. The main idea is to aggregate all institutions of the diagram into one big institution and to label all the entities by their origin (a node or an edge of the diagram). This means that a signature of the Grothendieck institution is a pair (i,\Sigma) where i is a node of the diagram and \Sigma is a signature of the institution at i. A signature morphism (i,\Sigma) \rightarrow (i',\Sigma') in the Grothendieck institution is a pair (u,\varphi) where u is an edge i \rightarrow i' of the diagram and \varphi is a signature morphism in the institution at i' from the translation of \Sigma across the institution comorphism at u to \Sigma'. The (i,\Sigma)-sentences are just the \Sigma-sentences (at i), and likewise the models, but the translations of both sentences and models across (u,\varphi) make use of the respective translations given by the comorphism at u and the local translations corresponding to \varphi. The Grothendieck institution satisfaction relation \models_{(i,\Sigma)} is just \models_\Sigma of the institution at i.
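
The signature level of this construction can be sketched very compactly. The following Haskell fragment (again with purely illustrative names) records nothing more than the labelling just described; the sentences, models, and the satisfaction relation are not represented, since they simply delegate to the institution at the relevant node.

    -- A minimal sketch (illustrative names): Grothendieck-institution signatures
    -- and signature morphisms over a diagram of institutions, where 'node'
    -- indexes the component institutions and 'edge' indexes the comorphisms
    -- linking them.

    -- A Grothendieck signature (i, Sigma): a node i of the diagram together with
    -- a signature of the institution located at i.
    data GSignature node sig = GSignature node sig

    -- A Grothendieck signature morphism (u, phi) : (i, Sigma) -> (i', Sigma'):
    -- an edge u : i -> i' of the diagram together with a signature morphism phi,
    -- taken in the institution at i', from the translation of Sigma along the
    -- comorphism at u to Sigma'.
    data GSigMorphism edge mor = GSigMorphism edge mor

    -- Satisfaction in the Grothendieck institution delegates to the component:
    --   M |=_(i, Sigma) rho   iff   M |=_Sigma rho   in the institution at i.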

A series of properties required by specification theory have been gradually established for Grothendieck institutions. These results usually provide sets of sufficient (and often necessary) conditions for the lifting of institution-theoretic properties from the local level of the component institutions to the global level of the Grothendieck institution.

c. Ontologies

Computer science ontologies can be regarded as logic-based formal specifications. On the basis of this observation, Goguen (2006) introduced the institution-theoretic trend in the theory of ontologies, which is more or less a rephrasing of well-established specification-theory concepts and results in an ontology-theoretic setup. This has brought several notable gains for ontologies, such as very well developed structuring technologies and the Grothendieck institution approach to logical heterogeneity (Kutz et al., 2010).

This line of development plays the core role in the OMG standard Ontology, Modeling and Specification Integration and Interoperability (OntoIOp).

d. Logic Programming

The Herbrand theorems, as foundations of the logic-programming paradigm, have been developed at the level of abstract institutions by Diaconescu (2004c). This has opened up the possibility of developing logic programming over non-conventional logical structures, thus providing solid foundations for the combination of logic programming with other computing paradigms. Particularly important results in this line of development are the Herbrand-based foundations for constraint solving (Diaconescu, 2008), which show that at the denotational level constraint solving is just an instance of plain (abstract) logic programming, and logic programming for services (Ţuţu and Fiadeiro, 2015b), which extends the original approach over a concept of generalized substitution system (Ţuţu and Fiadeiro, 2015a).

e. Notes

The work on the specification language Clear (Burstall and Goguen, 1980) was very influential and preceded the institution-theoretic structuring of specifications. Another particularly influential work in this area was (Diaconescu et al., 1993).

Grothendieck institutions, preceded by an attempt by Diaconescu (1998) to lift specification theory concepts and results ‘by hand’ to the heterogeneous level, were originally introduced by Diaconescu (2002), but in a slightly improper way, using institution morphisms at the edges of the diagram representing the heterogeneous environment; this was soon corrected by Mossakowski (2002) to comorphisms.

The treatment of variables as signature extensions outlined in Sec. 3.c plays a key role in the institution-independent study of logic programming. It is an essential feature of the institutional approach to Herbrand theorems, one that has enabled the development of some of the most fundamental semantic concepts of logic programming, such as query and solution, in arbitrary institutions. Despite such mild assumptions, the logic-programming semantics of services (Ţuţu and Fiadeiro, 2015b) does not fit into the framework proposed in (Diaconescu, 2004c). This has led to an upgrade of the original approach, to Herbrand theorems over a concept of generalized substitution system (Ţuţu and Fiadeiro, 2015a) that extends institutions by allowing for direct representations of variables and substitutions, much in the spirit of context institutions (Pawlowski, 1996). The connection between the substitution-system-based and the original institution-independent approach to logic programming was studied in depth in (Ţuţu and Fiadeiro, 2015c).

6. Extensions

While non-classical logics can be captured properly as institutions, some of their finer aspects may lie beyond conventional institution theory. For example, the institutions of many-valued logics handle the ternary aspect of the satisfaction relation by adjoining truth values to the sentences; in this way we get a binary satisfaction relation. While this works well with respect to most aspects of many-valued logics, it cannot handle graded (non-binary) consequence. The same holds for modal logics: the conventional concept of institution does not allow for a general semantics of modalities. These shortcomings have been overcome by extensions of the definition of institution towards non-classical aspects of logics.

a. Many-valued Truth

The many-valued extension of institution theory is very simple: one just replaces the two classical truth values with an arbitrary set L of truth values. The satisfaction relation becomes a function

    \[\models_\Sigma \,\colon\; \mathrm{Sen}(\Sigma) \times \mathrm{Mod}(\Sigma) \rightarrow L\]

that for each sentence and model gives a truth value interpreted as a satisfaction degree. The Satisfaction Condition gets rephrased as

    \[(\mathrm{Mod}(\varphi)(M') \models_\Sigma \rho) = (M' \models_{\Sigma'} \mathrm{Sen}(\varphi)(\rho)).\]

Very often it is necessary that L comes as a partial order of some kind, such as a lattice. In order to get a genuine many-valued implication one needs even more: L has to be a residuated lattice.
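
A standard concrete choice (recalled here only as an illustration) is the real unit interval L = [0,1] with the Łukasiewicz structure, in which the many-valued conjunction and its residuated implication are given by

    \[ x \otimes y = \max(0,\, x+y-1) \quad \mbox{ and } \quad x \Rightarrow y = \min(1,\, 1-x+y), \]

so that x \otimes y \leq z if and only if x \leq (y \Rightarrow z), which is the defining adjunction of a residuated lattice.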

b. Modalities

The stratified institutions of Aiguier and Diaconescu (2007) refine institutions by considering models with states. Thus each model M comes equipped with a designated set [\![ M ]\!], subject to several coherence conditions. In the case of a Kripke model M, [\![ M ]\!] gives the set of its possible worlds.

c. Notes

The idea of extending the definition of institution to many-valued truth arose at the very beginning of institution theory (Mayoh, 1985), motivated by research in database theory; it was already mentioned in (Goguen and Burstall, 1992). Later works on many-valued institutions include (Eklund and Helgesson, 2010; Diaconescu, 2013, 2014).

7. References and Further Reading

a. Primary Sources

  • Răzvan Diaconescu. Institution-independent Model Theory. Birkhäuser, 2008.
  • Joseph Goguen and Rod Burstall. Institutions: Abstract model theory for specification and programming. Journal of the Association for Computing Machinery, 39(1):95–146, 1992.
  • Joseph Goguen and Grigore Roşu. Institution morphisms. Formal Aspects of Computing, 13:274–307, 2002.
  • José Meseguer. General logics. In H.-D. Ebbinghaus et al., editors, Proceedings, Logic Colloquium, 1987, pages 275–329. North-Holland, 1989.
  • Till Mossakowski, Joseph Goguen, Răzvan Diaconescu, and Andrzej Tarlecki. What is a logic? In Jean-Yves Béziau, editor, Logica Universalis, pages 113–133. Birkhäuser, 2005.
  • Till Mossakowski, Răzvan Diaconescu, and Andrzej Tarlecki. What is a logic translation? Logica Universalis, 3(1):59–94, 2009.
  • Donald Sannella and Andrzej Tarlecki. Foundations of Algebraic Specifications and Formal Software Development. Springer, 2012.

b. Secondary Sources

  • Marc Aiguier and Răzvan Diaconescu. Stratified institutions and elementary homomorphisms. Information Processing Letters, 103(1):5–13, 2007.
  • Arrais and José L. Fiadeiro. Unifying theories in different institutions. In Magne Haveraaen, Olaf Owe, and Ole-Johan Dahl, editors, Recent Trends in Data Type Specification, volume 1130 of Lecture Notes in Computer Science, pages 81–101. Springer, 1996.
  • Edigio Astesiano, Michel Bidoit, Hélène Kirchner, Berndt Krieg-Brückner, Peter Mosses, Don Sannella, and Andrzej Tarlecki. CASL: The common algebraic specification language. Theoretical Computer Science, 286(2):153–196, 2002.
  • Tomasz Borzyszkowski. Logical systems for structured specifications. Theoretical Computer Science, 286(2):197–245, 2002.
  • Rod Burstall and Joseph Goguen. The semantics of Clear, a specification language. In Dines Bjorner, editor, 1979 Copenhagen Winter School on Abstract Software Specification, volume 86 of Lecture Notes in Computer Science, pages 292–332. Springer, 1980.
  • Maura Cerioli and José Meseguer. May I borrow your logic? (transporting logical structures along maps). Theoretical Computer Science, 173:311–347, 1997.
  • Mihai Codescu and Daniel Găină. Birkhoff completeness in institutions. Logica Universalis, 2(2):277–309, 2008.
  • Răzvan Diaconescu. Extra theory morphisms for institutions: logical semantics for multi-paradigm languages. Applied Categorical Structures, 6(4):427–453, 1998. A preliminary version appeared as JAIST Technical Report IS-RR-97-0032F in 1997.
  • Răzvan Diaconescu. Grothendieck institutions. Applied Categorical Structures, 10(4):383–402, 2002. Preliminary version appeared as IMAR Preprint 2-2000, ISSN 250-3638, February 2000.
  • Răzvan Diaconescu. Institution-independent ultraproducts. Fundamenta Informaticæ, 55(3-4):321–348, 2003.
  • Răzvan Diaconescu. An institution-independent proof of Craig Interpolation Theorem. Studia Logica, 77(1):59–79, 2004a.
  • Răzvan Diaconescu. Elementary diagrams in institutions. Journal of Logic and Computation, 14(5):651–674, 2004b.
  • Răzvan Diaconescu. Herbrand theorems in arbitrary institutions. Information Processing Letters, 90:29–37, 2004c.
  • Răzvan Diaconescu. An axiomatic approach to structuring specifications. Theoretical Computer Science, 433:20–42, 2012a.
  • Răzvan Diaconescu. Borrowing interpolation. Journal of Logic and Computation, 22(3):561–586, 2012b.
  • Răzvan Diaconescu. Institutional semantics for many-valued logics. Fuzzy Sets and Systems, 218:32–52, 2013.
  • Răzvan Diaconescu. Graded consequence: an institution theoretic study. Soft Computing, 18(7):1247–1267, 2014.
  • Răzvan Diaconescu and Kokichi Futatsugi. CafeOBJ Report: The Language, Proof Techniques, and Methodologies for Object-Oriented Algebraic Specification, volume 6 of AMAST Series in Computing. World Scientific, 1998.
  • Răzvan Diaconescu and Marius Petria. Saturated models in institutions. Archive for Mathematical Logic, 49(6):693–723, 2010.
  • Răzvan Diaconescu and Petros Stefaneas. Ultraproducts and possible worlds semantics in institutions. Theoretical Computer Science, 379(1):210–230, 2007.
  • Răzvan Diaconescu, Joseph Goguen, and Petros Stefaneas. Logical support for modularisation. In Gerard Huet and Gordon Plotkin, editors, Logical Environments, pages 83–130. Cambridge, 1993. Proceedings of a Workshop held in Edinburgh, Scotland, May 1991.
  • Patrick Eklund and Robert Helgesson. Monadic extensions of institutions. Fuzzy Sets and Systems, 161:2354–2368, 2010.
  • Joseph Goguen. Types as theories. In George Michael Reed, Andrew William Roscoe, and Ralph F. Wachter, editors, Topology and Category Theory in Computer Science, pages 357–390. Oxford, 1991. Proceedings of a Conference held at Oxford, June 1989.
  • Joseph Goguen. Data, schema, ontology and logic integration. Journal of IGPL, 13(6):685–715, 2006.
  • Joseph Goguen and Rod Burstall. Introducing institutions. In Edward Clarke and Dexter Kozen, editors, Proceedings, Logics of Programming Workshop, volume 164 of Lecture Notes in Computer Science, pages 221–256. Springer, 1984.
  • Daniel Găină. Forcing, downward Löwenheim-Skolem and omitting types theorems, institutionally. Logica Universalis, 8(3): 469–498, 2014.
  • Daniel Găină and Marius Petria. Completeness by forcing. Journal of Logic and Computation, 20(6):1165–1186, 2010.
  • Daniel Găină and Andrei Popescu. An institution-independent proof of Robinson consistency theorem. Studia Logica, 85(1): 41–73, 2007.
  • Daniel Găină, Kokichi Futatsugi, and Kazuhiro Ogata. Constructor-based logics. Journal of Universal Computer Science, 18:2204–2233, 2012.
  • Oliver Kutz, Till Mossakowski, and Dominik Lücke. Carnap, Goguen, and the hyperontologies – logical pluralism and heterogeneous structuring in ontology design. Logica Universalis, 4(2):255–333, 2010.
  • Brian Mayoh. Galleries and institutions. Technical Report DAIMI PB-191, Aarhus University, 1985.
  • Till Mossakowski. Comorphism-based Grothendieck logics. In K. Diks and W. Rytter, editors, Mathematical foundations of computer science, volume 2420 of Lecture Notes in Computer Science, pages 593–604. Springer, 2002.
  • Till Mossakowski. Heterogeneous specification and the heterogeneous tool set. Habilitation thesis, University of Bremen, 2005.
  • Wieslaw Pawlowski. Context institutions. In Magne Haveraaen, Olaf Owe, and Ole-Johan Dahl, editors, Recent Trends in Data Type Specification, volume 1130 of Lecture Notes in Computer Science, pages 436–457. Springer, 1996.
  • Marius Petria and Răzvan Diaconescu. Abstract Beth definability in institutions. Journal of Symbolic Logic, 71(3):1002–1028, 2006.
  • Donald Sannella and Andrzej Tarlecki. Specifications in an arbitrary institution. Information and Control, 76:165–210, 1988.
  • Andrzej Tarlecki. On the existence of free models in abstract algebraic institutions. Theoretical Computer Science, 37:269–304, 1986a.
  • Andrzej Tarlecki. Bits and pieces of the theory of institutions. In David Pitt, Samson Abramsky, Axel Poigné, and David Rydeheard, editors, Proceedings, Summer Workshop on Category Theory and Computer Programming, volume 240 of Lecture Notes in Computer Science, pages 334–360. Springer, 1986b.
  • Andrzej Tarlecki. Quasi-varieties in abstract algebraic institutions. Journal of Computer and System Sciences, 33(3):333–360, 1986c.
  • Ionuţ Ţuţu and José L. Fiadeiro. From conventional to institution-independent logic programming, 2015a.
  • Ionuţ Ţuţu and José L. Fiadeiro. Service-oriented logic programming, 2015b.
  • Ionuţ Ţuţu and José L. Fiadeiro. Revisiting the institutional approach to Herbrand’s theorem. In Algebra and Coalgebra in Computer Science, Leibniz International Proceedings in Informatics. Schloss Dagstuhl, 2015c.
  • George Voutsadakis. Categorical abstract algebraic logic: Algebraizable institutions. Applied Categorical Structures, 10:531–568, 2002.

c. Auxiliary Non-Institutional Sources

  • Jon Barwise. Axioms for abstract model theory. Annals of Mathematical Logic, 7:221–265, 1974.
  • Jean-Yves Béziau, editor. Universal Logic: an Anthology. Studies in Universal Logic. Springer Basel, 2012.
  • Chen-Chung Chang and H. Jerome Keisler. Model Theory. North Holland, Amsterdam, 1990.
  • Samuel Eilenberg and Saunders Mac Lane. General theory of natural equivalences. Transactions of the American Mathematical Society, 58:231–294, 1945.
  • Joseph Goguen. A categorical manifesto. Mathematical Structures in Computer Science, 1(1):49–67, March 1991. Also as Programming Research Group Technical Monograph PRG–72, Oxford University, March 1989.
  • Alexandre Grothendieck. Catégories fibrées et descente. In Revêtements étales et groupe fondamental, Séminaire de Géométrie Algébraique du Bois-Marie 1960/61, Exposé VI. Institut des Hautes Études Scientifiques, 1963. Reprinted in Lecture Notes in Mathematics, Volume 224, Springer, 1971, pages 145–94.
  • Andrei N. Kolmogorov. On the principle of the excluded middle. Matematicheskii Sbornik, 32:646–667, 1925 (in Russian).
  • Jerzy Łoś. Quelques remarques, théorèmes et problèmes sur les classes définissables d’algèbres. In Mathematical Interpretation of Formal Systems, pages 98–113. North-Holland, Amsterdam, 1955.
  • Saunders Mac Lane. Categories for the Working Mathematician. Springer, second edition, 1998.
  • Günter Matthiessen. Regular and strongly finitary structures over strongly algebroidal categories. Canadian Journal of Mathematics, 30:250–261, 1978.
  • Alfred Tarski. The semantic conception of truth. Philos. Phenomenological Research, 4:13–47, 1944.
  • Johan van Benthem. Modal Logic and Classical Logic. Humanities Press, 1988.

 

Author Information

Răzvan Diaconescu
Email: Razvan.Diaconescu@imar.ro
Simion Stoilow Institute of Mathematics of the Romanian Academy
Romania

Francisco Suárez (1548—1617)

Sometimes called the “Eminent Doctor” after Paul V’s designation of him as doctor eximius et pius, Francisco Suárez was the leading theological and philosophical light of Spain’s Golden Age, alongside such cultural icons as Miguel de Cervantes, Tomás Luis de Victoria, and El Greco. Although initially rejected on grounds of deficient health and intelligence when he attempted to join the rapidly growing Society of Jesus, he attained international prominence within his lifetime. He taught at the schools in Segovia, Valladolid, Rome, Alcalá, Salamanca, and finally at Coimbra, the last at Philip II’s insistence.

Not all of the attention Suárez received was positive. His Defensio fidei, published in 1613, defended a theory of political power that was widely perceived to undermine any monarch’s absolute right to rule. He explicitly permitted tyrannicide and argued that even monarchs who come to power legitimately can become tyrants and thereby lose their authority. Such views led to the book being publicly burned in London and Paris.

Suárez was clearly a scholastic in style and temperament, despite coming after the rise of humanism and living on the cusp of what is usually identified as the era of modern philosophy. His writings are sometimes said to contain the whole of scholastic philosophy because in addressing a question he surveys the full range of scholastic positions—Thomist, Scotist, nominalist, and others—before affirming one of those positions or presenting his own variant. The position he ultimately settles on is likely to be a via media.

Suárez’s greatness as a philosopher comes precisely from his magisterial weighing of all the competing positions across an extraordinarily broad range of theological and philosophical issues. The combination of broad systematicity, detailed elaboration, and thorough argumentation for his preferred view and against contrary views finds few rivals. He is a philosopher’s philosopher.

The even-handed presentation of the panoply of scholastic positions also explains why Suárez’s writing served as one of the key conduits through which medieval philosophy influenced early modern philosophy. Descartes, Leibniz, and Wolff, among others, learned scholasticism at least in part from reading Suárez, a scholasticism from which they then borrowed in developing their own philosophical theories.

Table of Contents

  1. Life
  2. Writings
  3. Thought
    1. Metaphysics and Theology
    2. Distinctions
    3. Esse and Essentia
    4. Efficient and Final Causation
    5. Existence of God
    6. Categories and Genera
    7. Beings of Reason
    8. Middle Knowledge, Grace, and Providence
    9. Natural Law and Obligation
    10. Political Authority
  4. Legacy
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Francisco Suárez was born on January 5, 1548 in Granada to Gaspar Suárez de Toledo and Antonia Vázquez de Utiel, a mere half-century after the Catholic Monarchs, Ferdinand and Isabella, wrested the city from eight centuries of Muslim control. The family was prosperous, although members of earlier generations on the maternal side ran into trouble with the Inquisition due to Jewish lineage, and several were burned at the stake. Suárez’s uncle, Francisco Toledo de Herrera, was a prominent professor of philosophy who became the first Jesuit cardinal. His extended family included other figures of some note, among them an archbishop cardinal and a viceroy of Peru.

Suárez was the second son of eight children. His brother Baltasar also joined the Jesuit order and was sent to the Philippines but died en route. Another brother became a priest, and three sisters joined a Jeronymite convent. Whether these facts indicate an exceptionally devout family or an attempt to mitigate some converso ancestry is less clear.

His childhood seems to have been unexceptional. He was schooled in Latin and rhetoric until thirteen, at which point he went to Salamanca to study canon law for three years. His academic performance at this point was lackluster. During his time at Salamanca he heard the legendary sermons of the Jesuit Juan Ramirez and felt the call to join the Jesuit order. In what seems stunning in retrospect, Suárez was rejected by the then-fledgling order on grounds of insufficient intellectual aptitude. Suárez persisted, and finally, after numerous appeals, he was admitted to the order in 1564 as an indifferent, that is, provisionally accepted with the understanding that he might be refused entry into the priesthood.

At this point, Suárez’s academic abilities seem to have flowered so suddenly that some biographers attribute the flowering to the miraculous intervention of Mary, the Mother of God. His newfound academic abilities did not go unnoticed, and he was sent to study theology at the University of Salamanca, then one of the most prestigious universities, at the height of its glory and at the center of the Iberian revival of scholasticism. In 1570 he performed the “Grand Act” at Salamanca, something done only by the most gifted students. The Grand Act was a public academic exercise, resembling a medieval quodlibetal dispute, in which the student needed to be able to answer questions and resolve difficulties posed by professors and visitors. Suárez already had enough of a reputation to ensure prominent guests and a full hall. His new reputation also led his superiors to have him start teaching philosophy rather than first teaching grammar or rhetoric, as was usually the case. Between 1570 and 1580, Suárez taught at several institutions around Spain: Salamanca, Segovia, Valladolid, and Avila. His earliest works come from this period.

Charges of novelty also started in this period. On Suárez’s telling, the problem was that he refused to teach by dictation from copy books but rather searched “for truth at its very roots.” The resulting controversy may have factored into his new position. In 1580 Suárez was called to Rome to join the Collegio Romano and to contribute to the development of the famous Jesuit pedagogical document, the Ratio Studiorum. The Roman College was an intellectually stimulating place to be, including such luminaries as the humanist Francisco Sanches, the theologians Robert Bellarmine and Gregory of Valencia, and the mathematician Christopher Clavius. Suárez also became a close colleague of the extraordinarily young general of the Jesuit order, Claudio Acquaviva.

In 1585 Suárez started teaching at the University of Alcalá. His seven years there were marked by strife with other theologians, notably with his colleague Gabriel Vasquez, a dispute that left its mark on Jesuit philosophical history for generations. Suárez was unhappy with the distraction of all these conflicts and requested release from his position. Finally, in 1592, he was sent back to the university of his student days, Salamanca, where he wrote his best-known work, the Disputationes metaphysicae.

In 1597, the same year that Disputationes metaphysicae was published, Suárez moved yet again, this time to Coimbra in Portugal at the behest of Philip II. The first time Philip asked Suárez to move to Coimbra, in 1596, Suárez declined since he feared Coimbra’s teaching responsibilities would keep him from his writing projects. Because Spain had claimed Portugal during the 1580 Portuguese succession crisis, Suárez may also have feared the political situation, since the Portuguese were likely to be less than welcoming of a Spanish professor appointed by a Spanish king (even if the Spanish king was also Philip I of Portugal). Furthermore, Suárez would have occupied a new Jesuit chair at a Dominican university and tensions were running high between the two orders. Philip was disinclined to accept Suárez’s declination, but granted it after a personal visit. Unfortunately, the person appointed in his stead died at the end of 1596, raising the issue all over again. This time Philip insisted that Suárez move to Coimbra. An amusing consequence was that Suárez now needed to acquire a doctoral degree, since the Coimbra faculty objected to Suárez’s post without one. The Jesuit Provincial in Lisbon was happy to confer one, but this failed to satisfy. Finally, Suárez made a trip to the University of Evora in southern Portugal, where he directed a public theological debate and was rewarded with a doctorate.

Suárez stayed at Coimbra until his retirement in 1613, although this was a retirement only from teaching. Among other projects, Suárez hoped to revise an earlier set of lecture notes into a commentary on Aristotle’s De anima. The revision remained unfinished, however, at Suárez’s death on September 25, 1617.

2. Writings

By Suárez’s time, Aquinas’s Summa theologiae (henceforth, ST) had to some extent replaced Lombard’s Sentences as the standard theological textbook and subject of commentaries. Many of Suárez’s works are offered as commentaries on particular sections of ST. Suárez generally does not, however, offer line-by-line comments on Aquinas’s texts; still, the arrangement of subjects in his work mirrors that in Aquinas’s text, the questions he considers often grow out of it, and he constantly cites passages from it.

In correspondence to ST Ia (that is, the First Part), Suárez wrote De Deo uno et trino (“On God, One and Triune”), De angelis (“On Angels”), De opere sex dierum (“On the Work of the Six Days [of Creation]”), and De anima (“On the Soul”). The last work, in particular, has received significant attention from Suárez scholars and is the primary source for his psychological views. One should also note in passing that titles are apt to be a source of confusion for those less familiar with Suárez’s works, since scholars will sometimes use titles for smaller or larger divisions of his work. For example, references to De divina substantiae ejusque attributis, De divino praedestinatione, and De SS. Trinitatis mysterio are all to treatises that make up the De Deo uno et trino mentioned above. Conversely, the latter three works mentioned above are sometimes gathered under the title De Deo effectore creaturarum omnium, though they are more commonly cited individually. Listing all the variations would be tedious, but readers should note that a citation of a seemingly unfamiliar work of Suárez’s might simply be using the title of a collection of treatises or the title of a part of a larger collection.

In correspondence to ST IaIIae (that is, the First Section of the Second Part), Suárez wrote De ultimo fine hominis (“On the Ultimate End of Man”), De voluntario et involuntario (“On the Voluntary and Involuntary”), De bonitate et malitia actuum humanorum (“On the Goodness and Evil of Human Acts”), De pasionibus et habitibus (“On Passions and Habits”), and De vitiis atque peccatis (“On Vices and Sins”), all published together in one volume under the title Tractatus quinque ad primam secundae D. Thomae. These works have received less scholarly attention, although they are obviously of great relevance for determining Suárez’s ethical views. De legibus seu de Deo legislatore (“On Laws or on God the Lawgiver”) also corresponds to IaIIae and has received more attention as one of Suárez’s greatest and most influential works. Seeing De legibus as a commentary on Aquinas’s ST helps to understand it properly, as well. It is sometimes read as a comprehensive presentation of Suárez’s ethical views, but Suárez himself conceives of it as corresponding to questions 90-108 of ST IaIIae. Those questions, of course, constitute a small fraction of Aquinas’s ethical thought. Finally, Suárez’s massive De gratia (“On Grace”) and some of the shorter theological works he wrote in connection with the controversy De auxiliis (to be addressed later) present an extraordinarily detailed account of grace and connected theological matters.

Suárez wrote fewer works on ST IIaIIae (a fact that is grist for the scholarly mill), but De fide theologica (“On Theological Faith”), De spe (“On Hope”), De caritate (“On Charity”), and the again massive De virtute et statu religionis (“On the Virtue and State of Religion”) do correspond to it. The last work offers a defense of the Society of Jesus along with a detailed examination of its principles. These works generally have received rather little scholarly attention, with the exception of the well-known disputation on war included in De caritate. That disputation is the main source for Suárez’s just-war theory.

Corresponding to ST IIIa are De verbo incarnato (“On the Incarnate Word”), De mysteriis vitae Christi (“On the Mysteries of Christ’s Life”), De sacramentis (“On the Sacraments”), and De censuris (“On Censures”). These works have received little attention from recent philosophers, but slightly more from theologians. De mysteriis vitae Christi, in particular, is significant for including several hundred pages of discussion of questions related to the Blessed Virgin Mary. It is this work that has earned Suárez a prominent place in the history of systematic Mariology.

Not all of Suárez’s works fit into the ST framework, especially those sparked by the religious and political controversies of his day. The famous controversy De auxiliis raged between the Dominicans and Jesuits during Suárez’s lifetime, and he contributed a number of works defending the Jesuit position. The central issue was how to explain human free will on the one side and divine foreknowledge, providence, and grace on the other in such a way that they were compatible with each other, without falling into one heresy or another. Suárez’s works dealing with these matters include: De vera intelligentia auxilii efficacis (“On the True Understanding of Efficacious Aid”), De concursu, motione et auxilio Dei (“On God’s Concurrence, Motion, and Aid”), De scientia Dei futurorum contingentium (“On God’s Knowledge of Future Contingents”), De auxilio efficaci (“On Efficacious Aid”), De libertate divinae voluntatis (“On the Freedom of the Divine Will”), De meritis mortificatis, et per poenitentiam reparatis (“On Merits Destroyed and Revived through Penance”), and De justitia qua Deus reddit praemia meritis, et poenas pro peccatis (“On the Justice by Which God Gives Rewards for Merits and Punishment for Sins”).

More political in nature is the treatise De immunitate ecclesiastica a Venetis violata (“About the Ecclesiastical Immunity Violated by the Venetians”), a work written in defense of the papal position in the dispute between the papacy and the Republic of Venice about the extent of papal jurisdiction.

Best-known of Suárez’s controversial works is his response to the English monarch James I, Defensio fidei catholicae adversus anglicanae sectae errores (“A Defense of the Catholic Faith Against the Errors of the Anglican Sect”), written at the request of the papal nuncio in Madrid. Suárez’s initial aim was to respond to James I’s arguments for requiring Catholic subjects to take an oath of loyalty. The work became much more than that, however, and offers much of interest to theorists of political power. In it, Suárez opposes the absolute right of monarchs, argues that the papacy has indirect power over temporal rulers, and, perhaps most inflammatory of all, argues that there are situations in which citizens may legitimately resist a tyrannical monarch, to the point of tyrannicide. The work was promptly condemned by the English king and publicly burned in London. James tried, with some success (especially in France), to enlist the support of other European monarchs in condemning Suárez and, more generally, the Jesuit order.

Last but certainly not least, Suárez’s best-known and most influential work is his Disputationes metaphysicae (“Metaphysical Disputations”; henceforth, DM). The work is meant to cover the questions pertaining to the twelve books of Aristotle’s Metaphysics, but part of its significance lies in its not being a commentary on Aristotle’s work and not following its organization. Suárez is unhappy with Aristotle’s organization and so writes a large, two-volume work in which he sets out a comprehensive treatment of metaphysics in systematic fashion, organized into fifty-four disputations. The work is widely credited with being the first to offer a comprehensive, systematically organized metaphysics and with initiating a long tradition of such works. DM was immediately and extraordinarily popular, was widely reprinted all over Europe, and quickly became the standard work to consult on metaphysical matters.

DM is divided into two main parts. The first part deals with the object of metaphysics, the concept of being, the transcendentals (that is, the essential properties of being as such, namely, unity, truth, and goodness), and the causes of beings. The second part deals with the divisions of being, first into infinite and finite. Finite being is then divided into substance and accident, with the latter then divided into the nine categories familiar from Aristotle. The last disputation concerns beings of reason, which, strictly speaking, fall outside the scope of metaphysics on Suárez’s conception, but an understanding of which is helpful for understanding metaphysics.

Aside from its systematic scope, few readers of DM have failed to be impressed by the extraordinary erudition displayed in its discussions. Each section includes a careful cataloguing of all the different positions that have been—and might be—taken with respect to the issue under question. DM’s two volumes include thousands of citations of hundreds of authors: Christian scholastic, Muslim, Patristic, ancient, and others. This may be the feature Schopenhauer has in mind when he characterizes DM as the “authentic compendium of the whole of scholastic wisdom.”

It is worth noting at this point that Suárez comes after centuries of a continuous tradition of professional theology and philosophy and consequently inherits a formidable accumulation of distinctions, technical terminology, and the like. Standard arguments and positions are often assumed or indicated in summary fashion. The resulting work is exceptionally sophisticated and can be highly rewarding to a reader with some familiarity with that tradition, but it can also be forbidding and alien to readers not so familiar.

The situation is not helped by the fact—astonishing in light of Suárez’s caliber and influence—that very little of his work is available either in critical edition or translation. Suárez’s works suffer far fewer textual issues than many medieval works do, so the lack of critical editions impedes less than it might. A significant number of his works were published posthumously, however, and the editorial decisions of his literary executor, Baltasar Alvarez, sometimes leave something to be desired. The value of translations, of course, is evident, yet aside from significant portions of DM very little has been translated into English (or any other language, for that matter). As the preceding presentation should have made clear, of course, Suárez was awe-inspiringly prolific, so a complete translation would be a monumental task. Furthermore, in addition to the mentioned works, there are still several unedited and unpublished works.

3. Thought

The first thing to note when trying to get one’s bearings with Suárez’s thought is that he, like many other scholastic authors, is a Christian Aristotelian. His thought is so thoroughly imbued with both Christianity and Aristotelianism that it would be difficult to find a single page in his writings not containing obvious traces of both. As is well known, Aristotle says some things incompatible with orthodox Christianity, such as that the world had no beginning, so an orthodox Christian must modify Aristotelianism to some degree. Nonetheless, one can better understand what Suárez says and why he says it once one recognizes that he is committed to orthodox Christian doctrines—more particularly, Roman Catholic doctrines—and that his philosophical framework and conceptions are rooted in Aristotle.

On the Christianity side, Suárez is committed to there being a God, a God who created and sustains human beings and the world they inhabit. God is a Trinity of Father, Son, and Holy Spirit, but also One, indeed, utterly simple. God exists necessarily, is infinite, immutable, and eternal. God is perfectly good, omniscient, and omnipotent, and orders everything in the universe providentially. Nevertheless, human beings sinned, consequently requiring God’s grace to attain their salvation and their ultimate end, that is, God. A central part of this story is the Incarnation of the second member of the Trinity, in which Christ assumes a human nature in addition to his divine nature. A moment’s reflection suffices to see that these doctrines raise a host of philosophical issues; much of Suárez’s work is devoted to such issues.

On the Aristotelian side, the most notable inheritances are hylomorphism (the view that objects are composed of matter and form); the four-causes explanatory paradigm (material, formal, efficient, and final); the categorial scheme of substance and nine categories of accidents; the view that the human soul is the form of the body; and the language of ultimate ends, happiness, and virtue in ethics. As scholars of medieval philosophy well know, this Aristotelian legacy leaves considerable room for philosophical disputes, in part due to unanswered questions, in part due to answers susceptible to differing interpretations.

The three authors most frequently cited by Suárez are Aristotle, Aquinas, and Scotus, in that order. Counting citations is not generally an infallible guide to influence, but in this case the citations seem an accurate reflection of it. Aristotle’s influence is pervasive. Suárez claims to follow Aquinas throughout, though he no doubt exaggerates his fidelity to Aquinas. (The Jesuit order of which he was a member required fairly close adherence to Aristotle in matters of philosophy and to Aquinas in matters of theology.) The number of references to Scotus, however, suggests that Suárez also had great respect for his thought. How much this respect led to influence is a matter of some controversy. Detecting such influence is made more difficult by Suárez’s practice of presenting his own view as that of Aquinas (properly interpreted), even where one might doubt their harmony. Closer examination of Suárez’s views confirms that his respect for Scotus at least occasionally led to his adopting broadly Scotistic views.

The enormous range of issues addressed in Suárez’s writings ensures the impossibility of surveying all of them in an encyclopedia entry. What follows is a sampling of the issues that have received at least some scholarly attention, but it should be noted that a variety of significant topics, for example, his just war theory and his psychological views, have been omitted. The order roughly follows the order of DM with several issues drawing primarily on other works appended at the end.

a. Metaphysics and Theology

There are at least two issues concerning Suárez’s conception of metaphysics that have been the source of some controversy. Both issues also relate to the perennially enticing question of whether Suárez is the last medieval or the first modern.

Suárez opens his systematically ordered Disputationes metaphysicae by asking what the object of metaphysics is. The question received much discussion in scholastic philosophy, no doubt in part because Aristotle suggests several candidates that do not look equivalent in his Metaphysics. Suárez canvasses and criticizes six answers before giving his own answer: “being insofar as it is real being” (DM 1.1.26). Real being includes both infinite being (God) and finite being, both substances and real accidents. What it excludes, notably, is beings of reason, that is, beings that cannot exist other than as objects of thought. Note the “cannot.” On Suárez’s conception, real being includes both actual being and possible being. It does not, however, include privations, for example, blindness, and other such beings of reason.

Suárez considers metaphysics a unified science in the Aristotelian sense. The function of a science is to demonstrate the properties of its object through the latter’s principles and causes (DM 1.1.27 and 1.3.1). The term “properties” here is not being used in its wide contemporary sense but rather to refer to features necessarily possessed by the members of a kind yet not essential to them. The classic example is the capacity to laugh, which scholastics generally deem a feature necessarily possessed by human beings yet do not deem an essential feature, as rationality is. The function of metaphysics, then, is to identify the properties of being insofar as it is real being, and to demonstrate their necessary possession by appeal to the principles and causes of being insofar as it is real being.

The immediate objection is that being, insofar as it is real being, has neither necessary features that can be demonstrated nor principles and causes (DM 1.1.27). Consequently, it fails to make a suitable object for metaphysics. With respect to the first part, the thought is that being insofar as it is real being is so far abstracted that it has no properties. Trees, rocks, angels, and so forth all have properties, but what property does the being common to every being have? With respect to the second part, on the view that Suárez shares, the being par excellence is God, and God has no causes. Suárez, however, denies both claims. Being insofar as it is real being may not have any properties really distinct from it, but it does have properties that are at least distinct in reason, namely, the transcendentals: unity, truth, and goodness. In a similar move, Suárez denies that a science need appeal to causes, strictly speaking. Rather, appeal to principles or cause in a looser sense is sufficient. His example is that we can demonstrate God’s unity from God’s perfection, even though the latter is not strictly speaking a cause of the former (DM 1.1.28-29).

This conception of metaphysics might seem much narrower than more recent conceptions. It might suggest that all Suárez needed to do to finish DM was to demonstrate being's unity, truth, and goodness. As the size of DM attests, however, Suárez addresses a great many more topics. First, he also includes disputations on the contraries of the transcendentals, namely distinction, falsity, and evil. Second, he includes one of the most thorough treatments of causation ever written, on the grounds (i) that even if being insofar as it is real being does not have a cause, most beings do and (ii) that all being is in some way a cause even if not itself caused. Finally, the entire second half of DM deals with the divisions of beings and includes lengthy disputations about the nature of particular kinds of beings such as substances, qualities, and so forth. In practice, then, the range of subjects discussed in DM comes much closer to the range of subjects that would be discussed in a modern metaphysics textbook.

Turning to the two matters of controversy mentioned earlier, the first question one may ask is how realist Suárez's account of metaphysics is. An oft-told narrative has it that Aristotle and his medieval followers held metaphysics to concern the real rather than the mental, while at some point in the modern era metaphysics came to be about structures of thought (or something similarly mental) rather than about the extramental world. On the face of it, Suárez's account seems impeccably realist. After all, metaphysics' object is being insofar as it is real being. Despite this appearance, Suárez has often been identified as one of the key figures in the transition to a non-realist approach to metaphysics. The room for debate comes with a later passage in which Suárez says that metaphysics' object is "the objective concept of being as such" (DM 2.1.1). This does not obviously contradict the earlier statement that the object is being insofar as it is real being, but neither is it obviously in harmony with it. The question now becomes how Suárez understands objective concepts. If objective concepts borrow their ontological status from their objects, so to speak, then Suárez can readily be read as a realist. (And note that on Suárez's view, objective concepts are concepts by extrinsic denomination; they are not really distinct from the objects. See, for example, DM 2.1.1 and 6.5.3.) On the other hand, if an objective concept is a mental item, then this later passage opens the door to less realist readings. Resolving this protracted dispute in the literature is obviously beyond the scope of this article.

The second question concerns the relationship of metaphysics to theology. Suárez makes it perfectly clear in the preface to DM that he adheres to the standard medieval view that metaphysics, albeit perfective of the human mind in its own right, ought to be in service of theology. He is, however, frustrated with piecemeal metaphysical discussions scattered throughout theological treatises, and so he writes DM to present a comprehensive metaphysics in the proper “order of teaching.” In this project, some detect a distinctly modern attitude and an early example of an increasing trend to see metaphysics as a separate discipline from theology. Others, however, are skeptical and think that Suárez sees metaphysics as distinct in the same way that his medieval predecessors did (namely, insofar as metaphysics proceeds by the “natural light of reason” apart from divine revelation) but in no additional way (for example, not by thinking that metaphysics should be conducted wholly autonomously).

b. Distinctions

The question of whether one item is distinct from another comes up in many contexts. One such context has already been seen, namely, whether objective concepts are distinct from their objects. A naïve approach might just ask whether two items, A and B, are distinct or not. Is the human mind distinct from the brain? Are relations distinct from their relata? Is God's mercy distinct from God's justice? And so forth. Sustained philosophical reflection soon shows, however, that different sorts of distinctions might be posited between entities. Perhaps God's mercy is identical to God's justice in one sense but distinct in another sense. Suárez's scholastic predecessors frequently felt the need for a distinction between distinctions, so to speak, and so posited a plethora of distinctions. Suárez also recognizes the need for different distinctions but aims to prune the list down to three basic kinds: real, modal, and of reason. He argues in DM 7 that all other putative distinctions are actually one of these three kinds.

The two most obvious kinds of distinction are real distinctions and distinctions of reason (the two kinds that already made a showing in the previous section). A real distinction is, as the name suggests, the sort of distinction that holds between one thing and another thing. It is crucially an extramental distinction, one whose sign is mutual separability, that is, the possibility of either thing existing without the other. Distinctions of reason are distinctions in the mind only. There may be a distinction of reason between Mark Twain and Samuel Clemens, but there is no real distinction. A notable context in which Suárez wishes to appeal to distinctions of reason is when discussing God's attributes. The doctrine of divine simplicity entails that the divine attributes can only be distinct in reason. One complication for the notion of distinctions of reason is that Suárez thinks some of them have some sort of foundation in reality while others do not.

Suárez argues that in addition to real distinctions and distinctions of reason there is a third kind of distinction, intermediate between the other two. The mark of a real distinction is mutual separability while the mark of a modal distinction is one-way separability. On Suárez’s view, for example, the union of form and matter cannot exist without the form and matter, but form and matter can exist without their union. If the union itself were a really distinct thing, Suárez thinks, then a further union would be needed to unite the union to the form and matter, and so an infinite regress would have started. So union is not some really distinct thing. But union is not merely distinct in reason, since the form and matter can exist without being united. Consequently, an intermediate kind of distinction is needed.

The reason mutual separability and one-way separability are marks or signs of the corresponding distinctions, rather than simply constituting those distinctions, is that there are some important exceptions. On Suárez's view there is only a one-way separability between God and creatures. God can exist without creatures but not vice versa. Assuming one is not a monist à la Spinoza, that requires granting a case of real distinction without mutual separability.

c. Esse and Essentia

In Being and Some Philosophers, Etienne Gilson famously distinguishes four fundamental traditions in metaphysics on the basis of how being is understood, two of which are existentialism and essentialism. Both terms have been used in an unhelpfully wide array of senses, but for Gilson an existentialist privileges existence over essence while an essentialist does the converse. In Gilson's story, Thomas Aquinas is the hero who recognizes being as the very act of existing and metaphysics as the science of being insofar as it is being. Suárez, however, he sees as the paradigmatic essentialist in whose philosophy existence is no longer significant and for whom metaphysics becomes the science of essences. One may recall here that Suárez identifies the object of metaphysics as being insofar as it is real being, but includes both actual and merely possible being in real being. Gilson thinks, furthermore, that this essentialism gives Suárez a pivotal role in history, albeit a malignant one, since essentialism leads to a variety of further philosophical ills.

Suárez’s conception of metaphysics figures in this story, but so does his account of the distinction between esse (“being,” meaning here actual existence) and essentia (“essence,” meaning here an individual nature), a matter related to a variety of issues regarding necessary versus contingent existence, eternal truths, the possibility of Aristotelian science about contingent things, and so forth. The usual Thomist view—how precisely to understand Thomas himself is a matter of some controversy—is that there is an actual or real distinction between essentia and esse. Or, rather, there is such a distinction for created things, which exist contingently. God, however, exists necessarily and from himself (a se); in God there is no distinction between essentia and esse.

Suárez, however, rejects a real distinction between esse and essentia and, furthermore, rejects a modal distinction. This is, in part, because of his understanding of real and modal distinctions: a real distinction would suggest that an essentia could exist without esse and esse without essentia, and a modal distinction would suggest at least the former (one source of confusion in these discussions is that a Thomist real distinction is probably not the same as a Suárezian real distinction). There were some who were willing to bite one or both of those bullets, but Suárez argues at length for their unacceptability. On Suárez's view, then, even in created beings there is only a distinction of reason between esse and essentia. Since he is committed to the Christian doctrine of creation ex nihilo, the consequence is that an uncreated essentia is absolutely nihil, while, as expected, in a created being its essentia and esse are really the same, albeit conceptually distinct.

Much more would need to be said, and has been said by various scholars, to establish how significant this Suárezian claim is, especially with respect to Aquinas's position, or even whether it is significant at all. Inter-school polemics and differing terminologies have resulted in more heat than light on this issue.

d. Efficient and Final Causation

Suárez’s extraordinarily detailed explication of the four kinds of Aristotelian causes—material, formal, efficient, and final—has received increasing attention in the early 21st century, perhaps in part because seven of the relevant disputations have finally been published in English translation. The resulting scholarly discussions have often been tied up with questions about Suárez’s place in the history of philosophy. Is he a loyal member of the medieval guild or is he setting the stage for mechanism or modernity? Or, perhaps both? On one reading of his account of formal causation, Suárez is the tragic hero making a valiant attempt to defend substantial forms, but in the course of doing so he alters the conception of substantial forms from the traditional model, thereby inadvertently making substantial forms more susceptible to mechanist critique and dismissal. If so, then he is a loyal member of the medieval guild and yet sets the stage for the modern mechanical philosophy. Not all scholars think it is so, however; some think he actually offers remarkably persuasive arguments on behalf of a more or less traditional conception of substantial forms, arguments to which the early modern mechanists would have done well to pay more attention.

Similar issues arise with Suárez’s account of final causality and its relationship to efficient causality. Ends are that for the sake of which an agent acts, as good health is the end for the sake of which a doctor prescribes medicine and finding small invertebrates to eat is the end for the sake of which curlews push their long curved bills into mud. Aristotle—and Aquinas—famously take ends to be a kind of cause, namely, final causes. Appeal to final causes is essential for offering complete explanations. In fact, Aquinas goes so far as to say that all the other kinds of causation, including efficient causation, presuppose final causation. Without final causation there would be no efficient causation on his view. Some scholars, however, have argued that Suárez departs from Aquinas on this score and prioritizes efficient causation, perhaps even reducing final causation to efficient causation. If so, this would make Suárez look like an intermediate between Aquinas’s position and the widespread dismissal of final causation in early modern philosophy.

At first glance, one might think Suárez straightforwardly endorses a traditional picture. He divides his discussion of causation according to the Aristotelian fourfold classification, explicitly defends the status of all four causes as real causes (DM 12.3), and, most importantly, defends at length the claim that ends are real causes (DM 23.1) and argues that final causation is present in the actions of God, rational created agents, and natural agents (that is, non-rational agents such as cows and oak trees).

But a closer look reveals some grounds for those wishing to argue that Suárez emphasizes efficient causation at the expense of final causation. First, he explicitly states that the definition of "cause" applies most properly to efficient causes (DM 12.3.3). Second, when talking about efficient causation he uses the term "real motion," but when talking about final causation he uses the term "metaphorical motion." Third, final causality depends on efficient causality, since an end is an actual final cause only if an efficient cause acts on its behalf. Fourth, an end is a final cause only if cognized by a rational agent (see DM 23.7 and 23.10.6); this stands in contrast with Aristotle's confidence in final causation without the thought and intention of a rational agent. Finally, Suárez devotes far more pages to efficient causation than to final causation.

Nonetheless, scholars who wish to attribute a more traditional view to Suárez can also find support from a closer look at Suárez’s text. Taking the previous paragraph’s points in turn, one may first note that the term “cause” most properly applying to efficient causes is entirely compatible with the term properly applying to final causes, and that the significance of saying that the term applies most properly in one case but not the other is not immediately obvious. Second, it is true that Suárez uses the term “metaphorical motion” when describing the motion of final causes, but he is simply following well-entrenched terminological practice. Also, when pressed, he explicitly denies that metaphorical motion is so-called because it fails to be real motion (DM 23.1.14). Third, on Suárez’s view, actual final causation does seem to depend on efficient causation, but the converse is true as well. He grants Aquinas’s point that efficient causation presupposes final causation, that efficient causes would not act were they not to have ends for the sake of which to act. At the very least, there seems to be a sort of mutual dependence. In some passages, Suárez appears also to endorse the priority of final causation, though whether his view licenses that conclusion is a more complicated matter.

The fourth issue cannot be fully addressed without drawing in a number of other philosophical issues. One may briefly note, however, that, while Suárez does demand that ends be cognized in order to final-cause, he thinks this condition is satisfied even in the case of natural agents. This is a result of his concurrentism, according to which all actions of created things also have God as a concurring agent. That is, one and the same action has two agents, at least one of which is a rational agent. Of course, this account leaves final causation in the natural realm dependent on final causation in the divine realm. But final causation in the latter realm is not unproblematic. A central scholastic assumption, which Suárez shares, is that God is never subject to causation. So how can a final cause move God? And if it cannot, then how can natural actions inherit final causes from God? Suárez is well aware of this problem (DM 23.9.1). His response is to concede that there is no final causality in the case of God’s immanent actions, but that there is no problem with saying that God’s transeunt actions, that is, actions not located in God, have final causes. Whether this answer can be made to work in light of the details of his account of metaphorical motion in DM 23.4 is a further matter. With respect to the fourth issue, there is also a historical question whether the cognition requirement represents a change just from Aristotle or also a change from Aquinas.

These questions about how to fit Suárez’s account of causation into a broader history exemplify a common approach to Suárez. His place in time ensures that it is always tempting to read him as a transitional figure, as standing between the medieval view—where the “medieval view” typically means the view of Aquinas—and the views of the early modern mechanists. Consequently, a strand running through much Suárez scholarship concerns whether he in fact holds transitional views or not.

e. Existence of God

In DM 28, Suárez argues that the best primary division of being is into infinite and finite being, a division he considers equivalent to a number of other divisions including between necessary and contingent being, essential and participated being, and uncreated and created being. It is worth noting, however, that he does not take the term “being” to be used univocally when predicated of both infinite and finite being (DM 28.3). Nor does he go to the opposite extreme and consider it equivocal. Rather, he argues that “being” is used analogously in this case in virtue of the intrinsic characters of both infinite being and finite being.

As Suárez notes, the arguments of DM 28, assuming they work, already go a long way towards establishing the existence of God, since they purport to show that there must be some uncreated being. He devotes an entire disputation, however, to the question of God’s existence. His goal in DM 29 is to prove by natural reason, without any appeal to special revelation, that God exists, a goal that he thinks can be achieved.

His optimism about the possibility of demonstrating that God exists does not result in an uncritical attitude to previous efforts to do so. He rejects, for example, versions of the ontological argument that claim God's existence to be evident from the fact that necessarily existing is part of what it means to be God (De Deo uno et trino 1.1.1.9). Of course, so did Aquinas. Perhaps more surprising, given Suárez's debt to Aquinas, is that he also rejects the cosmological argument from motion made by Aristotle and made famous as Aquinas's first way (Summa theologiae Ia.2.3). This argument starts from the motion or change that we observe, claims that whatever is moved is moved by another and that there cannot be an infinite chain of moved movers, and so concludes that there must be an unmoved mover at the foundation. A key reason for Suárez to worry about this argument is that he not only regards the status of the Aristotelian principle that whatever is moved is moved by another as uncertain, but also thinks that we ourselves provide counterexamples to it through our free actions. Consequently, he thinks the physical cosmological argument ("physical" because motion pertains to physics) relies on a false premise.

Instead, he turns to a metaphysical version of the cosmological argument (“metaphysical” because being is the object of metaphysics). This argument starts from the observation that there are things that exist, notes that every being either is made by something else or is not (that is, is created or is uncreated), argues that not every being can be made by another being, and concludes that there must be some being that is uncreated (DM 29.1.20-40). The alternatives to the claim that not every being can be made by another being would be either that there is an infinite chain of beings, each made by a prior being, or that there is a circle of beings, each making the next one in the circle. Suárez argues that these alternatives are impossible.

Long before Hume, Suárez recognizes that this cosmological argument hardly suffices to show that there is an uncreated being that merits being called God. Multiple worries might be raised, but Suárez focuses on the observation that for all that the cosmological argument shows, there might be many uncreated beings making other beings. In response, he moves to the next stage of his argument and enlists the aid of teleological arguments, arguing that attending to the order, structure, and beauty of the world shows that there is only one uncreated being (DM 29.2). He considers a variety of objections: that the order indicates at most that there is one governor of the world, that the world might have been created and governed by a committee of uncreated beings working in consensus, and that our world might be only one of many worlds, each with its own uncreated creator. Suárez argues that some of these objections fail, but he concedes that the teleological or a posteriori argument he is considering cannot show that there are not other worlds with their own creators.

For the final stage, then, he turns to what he calls an a priori argument (DM 29.3). Strictly speaking, there can be no a priori arguments for God's existence on the scholastic understanding of a priori arguments. For such arguments are arguments from causes to effects and God has no causes. Suárez accepts this point, but suggests that once we have an a posteriori demonstration of a divine attribute, it is possible to demonstrate a priori further attributes from that attribute (cf. the aforementioned example of using God's perfection to demonstrate God's unity). Suárez then proposes to demonstrate that there can be only one uncreated being from an uncreated being's existing necessarily and a se (from itself). The resulting stretch of arguments is complex and relies on premises whose truth is not always obvious. Suárez himself is modest about the force of the argument, granting at the start that the proposed project is difficult and noting at the end that not all of the steps are immune to evasions. He does, however, think that the whole argument taken together will have some persuasive force for a reader who is not obstinate.

f. Categories and Genera

Thanks to the Aristotelian legacy, category theory was a prominent feature of scholastic philosophical discussions. Aristotle famously enumerated ten categories but left it unclear what the justification was for listing ten rather than fewer or more categories. A project that occasioned significant interest among medieval philosophers, then, was to provide the argument that would establish ten and only ten categories; such arguments were called sufficientiae. Deference to Aristotle was not universal, however, and so other philosophers argued that there are fewer than ten categories. There was also disagreement about what the categories are classifying. Extramental objects? Words? Concepts?

In these discussions, the terms “categories” (“praedicamenta”) and “highest genera” (“generalissima” or “suprema genera”) are often used interchangeably. At least some of the time, however, Suárez distinguishes between categories and genera and says that it is the business of logicians to deal with categories, since logic is concerned with the mind’s concepts, while it is the business of metaphysics to deal with the highest genera of beings, revealing their natures and essences (DM 39.pr.1). This suggests a view in which there is a kind of correspondence between the classification of concepts and the classification of extramental beings.

Be that as it may, Suárez qua metaphysician devotes the bulk of the second half of DM to dividing finite being (God falls outside the scope of this division) into the ten highest genera and discussing each genus in turn. Substance, of course, occupies a special role in Aristotelianism, and so Suárez first divides finite being into substance and accident and discusses substance at some length. He then turns to a discussion of the division of accidents, that is, the remaining nine genera on Aristotle’s view.

The first question he considers is whether nine genera of accidents—or ten genera in total—are too many (DM 39.1). He ends up affirming Aristotle's number but gives it a somewhat deflationary spin. He concedes that a variety of intermediate genera can be devised: real accidents versus accidents that are mere modes, absolute accidents versus respective accidents, and so forth. This concession raises questions about the significance of Aristotle's ten genera, if they are not the most basic or immediate divisions. Suárez, however, thinks Aristotle's division is nonetheless "most apt."

The second question concerns the sufficiency of the nine genera of accidents (DM 39.2). Suárez divides this into two issues: (i) are there genera beyond these nine? and (ii) are all nine genera distinct from each other? As usual, Suárez has great respect for his philosophical forebears and says that it would be “rash” to doubt Aristotle’s number. That said, when he discusses the sufficientiae of Aquinas and Scotus, he is obviously dissatisfied. He concludes that a proper a priori demonstration of the sufficiency of the ten highest genera cannot be given. This, he claims, should be no surprise, since a science presupposes its subject rather than demonstrating it.

But what sort of distinction is there between the genera? One would be surprised to find that a given bird belongs to two different genera, say, Ara and Tangara. Similarly, one might be surprised to find that Aristotle’s genera overlap or even coincide entirely. Yet Suárez quickly rejects the view that there is always a real distinction between items belonging to two or more of the highest genera. He is more sympathetic to the view that there is a modal distinction. There are, however, cases that keep him from accepting that view as well.

Relations, the fourth highest genus, are of special concern. There is evidence that at one point in his career Suárez took relations to be modally distinct from their foundations. It is also a view to which he gives more time and attention when he gives an explicit treatment of relations in DM 47, though ultimately he rejects the view in that disputation. Instead, he concludes that relations are only distinct in reason from their foundations. For example, if Socrates and Plato are similar to each other in virtue of each being white, then the foundation for Socrates’s relation of similarity to Plato is Socrates’s whiteness. On Suárez’s mature view, Socrates’s similarity relation is neither really nor modally distinct from Socrates’s whiteness. The relation and quality are only conceptually distinct. There is a persuasive reason for such a reductionist account of relations. If God creates a world with nothing but substances and absolute qualities, it seems the relations would ipso facto follow. No separate creative act would be needed to ensure that white Socrates and white Plato were similar.

Consequently, Suárez concludes that a distinction of reason suffices for a separate highest genus (DM 39.2.22 and 47.2.22). Sometimes there in fact is a real distinction. Suárez thinks items in the second and third genera, quality and quantity, are really distinct from substance and from each other. In the remainder of the cases, however, there is either only a modal distinction or only a distinction of reason. He does stress that the distinction of reason needs to be one with a foundation in reality, which raises challenging questions about what that foundation is. The standard example of distinctions of reason with a foundation in reality in scholastic philosophy is the distinction between God's attributes. That example may have been dialectically effective insofar as it was widely accepted, but it is not thereby an illuminating example. In the case of action and passion, two genera only distinct in reason, Suárez seems to suggest that the foundation in reality can be comparisons to extrinsic things. He says that action is distinct from passion insofar as action is compared to the principle that acts and passion to the principle that undergoes (DM 39.2.23).

It is worth noting in passing that Suárez thinks “being” is used analogically across the nine highest genera of accidents (DM 39.2.3). Consequently, one of the key texts relevant to the controversies about Suárez’s doctrine of the analogy of being is found in his discussion of the division of accidental being.

g. Beings of Reason

A moment’s reflection shows that we not only think and talk about existent objects such as stars and oak trees, we also think and talk about non-existent objects and even objects that could not exist, such as square circles or goat-stags. But, to use one of Suárez’s examples, what is one talking about if one says that two chimeras are similar but that goat-stags and chimeras are different? Ordinarily one might think that propositions are made true by beings. The proposition that many oaks have lobate leaves is made true by the many real oak trees with lobate leaves. But what makes the proposition that goat-stags and chimeras are different true? Besides, how can a thought be directed at one non-existent object rather than another?

Suárez calls such objects of thought “beings of reason,” and he ends DM with a disputation devoted to them (DM 54). The inclusion of this disputation might come as a surprise, given the systematic nature of the work and given that Suárez has carefully defined the object of metaphysics as “real being.” Well aware of his earlier definition, Suárez opens DM 54 with a prologue defending the treatment of beings of reason. Given that our thought inevitably involves beings of reason, someone needs to give an account of them. Suárez thinks the metaphysician is best suited to the task, since beings of reason are “shadows of being” and consequently should be treated in analogy to real being. Since real being is metaphysics’ object, metaphysicians are well-positioned to give an account of beings of reason also. Suárez’s is a controversial claim; some scholastics thought beings of reason the logician’s domain. Note, too, that the analogy between real being and being of reason is a rather weak one. Not only do the analogates not fall under any unitary concept, but beings of reason do not even have in themselves anything proportional to real beings (beings of reason have nothing in themselves). Suárez, however, points out that beings of reason are thought of as having a proportion to real beings, and insists that is sufficient for a kind of analogy of proportionality (DM 54.1.10).

As is his wont, Suárez attempts to thread a middle course. There are those who deny that there are beings of reason or that beings of reason are needed in order to teach about or conceive of real beings. On the other side are those who grant beings of reason and, furthermore, claim that there is a single concept of being that includes both real beings and beings of reason. Suárez’s own view is that there are beings of reason but that the “are” in that claim does not indicate the same thing as the “are” in the sentence “there are oak trees.” There is no single concept covering both real beings and beings of reason.

When first introducing his own view of beings of reason, Suárez characterizes them as “not true real beings, since they are not capable of true and real existence” (DM 54.1.4). It is worth noting the modal force in that characterization and recalling that for Suárez real being includes both actual and possible being. What Suárez has in mind when talking about beings of reason are things that cannot exist. That class turns out to be rather motley. Mythical beasts, negations, privations (for example, blindness), and even logical concepts such as genus and species are all beings of reason.

Suárez’s explicit definition comes two paragraphs later: “a being of reason is usually, and rightly, defined as that which has being only objectively in the intellect or as that which is thought by reason as a being even though it has no entity in itself” (DM 54.1.6; italics in the original). One issue in Suárez scholarship concerns whether the two disjuncts amount to equivalent definitions or not. If Suárez also thinks that it is possible to think of non-beings in the manner of non-beings, then it would look like there can be things that satisfy the first disjunct without satisfying the second disjunct. Another issue concerns how to understand the talk of being objectively in the intellect. On the traditional interpretation, Suárez’s view is that beings of reason have a peculiar mode of being, namely, as pure objects of thinking. Consequently, their being depends on minds actually thinking about them. That interpretation can and has been challenged, however, by a more eliminativist account that denies that “being only objectively in the intellect” is any sort of being at all. Rather, it just describes the state of affairs where an intellect has a contentful thought about something that simply fails to exist.

Most of the interest in Suárez’s account of beings of reason concerns his general characterization. DM 54, however, goes on to ask what sorts of causes, if any, beings of reason have and whether the traditional division of beings of reason into negations, privations, and relations of reason is right and sufficient. He answers that, rightly understood, it is.

Questions concerning beings of reason go back at least to the ancient Greek philosophers, but Suárez's discussion is a landmark in its detail and systematicity. It also leaves some matters unclear, however, and of course other philosophers found things with which to disagree. The result was a train of similarly extended, sophisticated discussions of beings of reason after Suárez. Some offered what might be seen as developments of Suárez's account, for example, Bartolomeo Mastri and Bonaventura Belluto, while others argued against Suárez and offered contrasting accounts, for example, Pedro Hurtado de Mendoza and John Punch.

h. Middle Knowledge, Grace, and Providence

As mentioned earlier, Suárez contributed to the raging debate about how to reconcile human free will with divine grace, foreknowledge, providence, and predestination that is known as the controversy De auxiliis. One of the developments for which he is best-known, albeit more in theological circles than philosophical circles, is a doctrine called Congruism. Suárez and Robert Bellarmine are the two Jesuits usually credited with formulating Congruism in detail.

To understand Congruism, it helps to step back and first look at the Molinism of which it is a species. A relatively straightforward way of reconciling human free will with the various theological doctrines in question is to provide a compatibilist account of free will, that is, an account of free will compatible with determinism. Luis de Molina, also a Jesuit, rejects that method, vigorously defending a libertarian account of free will. To be free means to be able to choose and able not to choose an option once all the prerequisites for acting have been posited. Related to this emphasis on libertarian freedom is the belief that the divine grace (whereby sinful humans can attain salvation) is not intrinsically efficacious. Rather, God’s grace is rendered efficacious by a human being’s free consent. These beliefs naturally raise questions about compatibility with traditional doctrines such as God’s foreknowledge and providential control. Molina famously proposes middle knowledge (scientia media) to show their compatibility. Middle knowledge—“middle” because it falls between the natural and free knowledge traditionally ascribed to God—is God’s prevolitional knowledge of what any possible free creature would do in any scenario. This ensures God’s foreknowledge, even of free actions, and allows God the appropriate providential control.

Suárez wholeheartedly agrees with Molina on the importance of libertarian freedom (DM 19.2-9). If, for example, God determines Peter to steal, then Peter cannot be held responsible for stealing. Suárez also agrees with the Molinist strategy of appealing to middle knowledge (De scientia Dei futurorum contingentium 2). There are, however, disagreements on this as well.

Molina argues that the reason God knows what creatures would freely do is because God’s infinitely surpassing the finite nature of creatures allows God to “super-comprehend” their natures, and thereby to know what they would do in given situations. Suárez, however, denies that any special explanation is needed for God’s middle knowledge. To know whether God can do something, we do not need to investigate God’s omnipotence. Rather, we merely need to establish that the thing is possible. Similarly, he thinks that to find out whether God can know something, we do not need to investigate God’s abilities. All we need to do is establish that the claim in question is true. If it is, then God’s omniscience ensures God’s knowledge of that claim. God knows propositions about what free creatures would do in the same way he knows any other proposition: by a simple intuition of its truth (De scientia Dei futurorum contingentium 1.8).

But Congruism’s main dispute with Molina concerns the reason for the efficaciousness of divine grace and the place of predestination. According to Molina, God bestows grace on, say, Mary and Martha, knowing that Mary will freely act so as to render the grace efficacious while Martha will not. Mary’s free acceptance is the sole extrinsic reason rendering the grace given to her efficacious, while Martha’s free failure to accept the grace is the sole extrinsic reason the grace given to her is sufficient rather than efficacious. Furthermore, God predestines Mary to salvation on the basis of his knowledge that she will freely accept his grace.

Suárez, however, attempts to carve out a position that is broadly Molinist but closer to the position of the Jesuits’ Thomist opponents. On his view, Mary’s free acceptance is not the only extrinsic reason rendering the grace efficacious. Nor does God predestine her to salvation on the basis of his middle knowledge. Rather, God antecedently elects some people to salvation but does not elect others. This election is gratuitous in the sense that it does not rest on God’s knowledge of any human merits. Having thus elected Mary, he then knows, thanks to his middle knowledge, what graces to bestow such that Mary will freely accept. If God had known that the grace given to Mary would not be such that Mary would freely accept it, then he would have given some other grace to her such that she would. In other words, the grace is efficacious not only because Mary freely accepts it, but also because of God’s antecedent decision to give whatever grace is needed to ensure Mary’s free acceptance, that is, his antecedent decision to give her a “congruous” grace (De concursu et auxilio Dei 3.14.9).

Suárez’s and Bellarmine’s Congruist version of Molinism was declared the official doctrine of the Jesuit order in 1613, supplanting Molina’s own version.

i. Natural Law and Obligation

A central question in the history of ethics goes back to Socrates in Plato’s Euthyphro: do the gods love the pious because it is pious or is it pious because the gods love it? Suárez directly addresses a variant of this question, albeit in a section whose title might not immediately reveal its philosophical significance: “Is the natural law truly a preceptive divine law?” (De legibus 2.6; henceforth, DL).

Scholastics customarily distinguish between natural and positive law. Explaining this distinction in neutral terms is made difficult by the widely varying theories of law on offer, but perhaps the central feature of natural law is an epistemological one. Natural law is law that is accessible to all human beings, regardless of their access to some holy text or other special revelation. As Suárez puts it, natural law "is that law which sits within the human mind in order to distinguish the fine from the wicked" (DL 1.3.9 or 10, depending on edition). Other features often associated with natural law are universality (that is, applicability across times and places) and being grounded in nature rather than in an arbitrary will. Positive law, on the other hand, is in some sense arbitrary law that is added to natural law and that is not, in principle, accessible apart from special promulgation or communication. A paradigm example is a law that requires people to drive on the right-hand side of the road. One might well think the choice between that law and the law requiring people to drive on the left is arbitrary, and one certainly cannot figure out which side of the road to drive on in a given country by introspecting. It is worth noting that Suárez recognizes two species of positive law: human and divine. In some contemporary discussions, positive law is characterized as human law and natural law as divine law. Understanding the terms in that way makes a hash of Suárez's discussion, as well as those of other scholastics.

Now we can see why natural law is of special interest with respect to the question Socrates posed to Euthyphro. All scholastic theorists of law think that there are at least some cases where laws get their obligatory force solely from a legislator. That is what positive law is like, including divine positive law. A standard example is the ceremonial law of the Old Testament, which Christians thought obligatory in one time and place but not in another. The question is whether all law is like that. Natural law is, of course, the place to look if one is wondering if there are obligations that are not grounded in a superior’s command or prohibition.

Suárez stresses that the question to be asked is whether natural law is divine law in the sense that it is grounded in God qua legislator. He deems it entirely obvious that God is the cause of natural law in some sense, since God is the creator of everything, including any nature in which natural law might be grounded (DL 2.6.2). But even if God is the cause in that sense, there is still the question whether created nature already indicates what ought to be done and what ought to be avoided or whether a further legislative act prescribing or forbidding actions is needed. Suárez calls law in the former sense indicative law and law in the latter sense preceptive law.

One position is extreme naturalism or intellectualism, which Suárez attributes to Gregory of Rimini and several others (DL 2.6.3). On this view, no legislative act on God’s part is needed. Rather, natural law simply indicates what should be done or not done on the basis of what is intrinsically good or bad. Loss of life, for example, is bad: murdering King Duncan deprives him of life, and so Macbeth ought not to stab Duncan. On the extreme naturalism espoused by Gregory, Macbeth’s duty not to murder Duncan would obtain even if God had not given the Ten Commandments and even if God had not existed at all.

On the other side is an extreme voluntarism that says that natural law consists entirely in a command or prohibition coming from God’s will, a view that Suárez attributes to William of Ockham (DL 2.6.4). On this view, what one ought or ought not to do is wholly determined by God’s legislative acts and, furthermore, God’s legislative acts are unconstrained. That is, there is no act that is intrinsically bad such that God is compelled to prohibit it or even prevented from commanding it and no act that is intrinsically good such that God is compelled to command it. Had God commanded us to murder and steal, then doing so would have been obligatory and good.

Characteristically, Suárez charts a middle course. He first agrees with the extreme voluntarists that natural law is genuinely preceptive law, and argues that for a law to be genuine law and not just law in name it must be grounded in the legislative act of a superior (DL 2.6.5-10). The obligatory force of natural law comes from God’s will. Contra Gregory of Rimini, that obligation would not be present had God not legislated or not existed at all.

But then comes the crucial qualification that ends Suárez’s agreement with extreme voluntarism: “Second, I say that this will of God—that is, this prohibition or precept—is not the whole reason for the goodness and badness that is found in observing or transgressing the natural law, but that the natural law presupposes in the acts themselves a certain necessary fineness or wickedness and adjoins to these a special obligation of divine law” (DL 2.6.11). The extreme voluntarist thinks that God is free to command as he wishes, unconstrained by the natures of things. Suárez, on the other hand, thinks that God’s commands and prohibitions are constrained by natural goodness and badness. As befits a perfect being, God prohibits some actions precisely because they are evil. Suárez thinks it absurd to suggest that there are no actions such that they are too evil for God to command or even just to permit. To this extent, then, Suárez agrees with the naturalist; the obligations of natural law are rooted in natural goodness and badness.

That Suárez attempts to chart a middle course of this sort is uncontroversial. There are, however, controversies about how much Suárez allows on the natural, pre-legislative side, as well as how what is allowed on the natural side relates to the “special obligation” resulting from God’s legislative acts. On one interpretation, Suárez’s account is incoherent, because he gives a voluntarist account of obligation and yet grants that performing an action naturally bad is sufficient for blameworthiness. Since being blameworthy requires violating an obligation, Suárez is thereby implicitly committed to saying that natural badness is sufficient to give rise to obligations. But this contradicts his voluntarist account of obligation. One way to avoid the charge of incoherence would be to understand Suárez as giving a voluntarist account of the “special obligation” imposed by natural law, and understanding that obligation as an additional obligation. In other words, the natural goodness and badness intrinsic to actions is sufficient to give rise to one sort of obligation and consequently sufficient to make agents who observe the obligations praiseworthy and agents who violate them blameworthy. Natural law then adds a further obligation to that natural obligation in the same way that human legislators might add further obligation by prohibiting what is already morally prohibited. As Suárez notes, there is nothing incoherent about adding one obligation to another (DL 2.6.12).

Noting that Suárez uses both the terms “duties” (“debita”) and “obligations” (“obligationes”), an alternative strategy is to interpret him as giving a voluntarist account of all obligation while granting that natural goodness and badness give rise to duties. In other words, natural goodness and badness would be sufficient to give rational agents duties to act in certain ways and not in other ways, but none of this would count as genuine obligation. Obligation, on this view, is a peculiar sort of force that arises from someone with legitimate authority issuing a command or prohibition to some subject to that authority. A task for defenders of this interpretation is to spell out what duties and obligations are, such that being subject to a duty is not sufficient for being under obligation.

j. Political Authority

One could hold the view that what gives some individuals political power over other people is that God bestowed such authority on them directly. Suárez rejects that view. He insists that men are by nature free and subject to no one (DL 3.1.1). (I shall use the term “men” in this section, since Suárez does not grant the same natural liberty to women and children. See DL 3.2.3.) Europeans had, of course, stumbled into the Americas shortly before Suárez’s time and so one of the questions that exercised his contemporaries concerned the standing of Native Americans, especially in light of Aristotle’s infamous claim that some human beings are natural slaves. Suárez has no use for the suggestion that Native Americans are natural slaves. Men are naturally disposed to be free and being free is one of their perfections; suggesting that all the people in some region or other happen to have been born “monsters,” that is, with defective natures, is incredible.

Suárez does, however, think that some rulers have legitimate authority over men. Where does that authority come from? The short answer is that political communities are needed and so men consent to join together in such communities, and in a political community the power to govern and to look after the common good of the community must be vested in an authority. Suárez gives two primary reasons why political communities are needed (DL 3.1.3). First, he agrees with Aristotle that human beings are social animals that desire to live in community. The most natural community is a family, but this is an imperfect community, insufficient to include within it all the skills and knowledge needed for life. Consequently, multiple families need to join together in a perfect (that is, complete) community. Second, anticipating Hobbes’ famous point, Suárez notes that individual families not joined together in a political community would be unlikely to remain at peace and would have no means of averting or avenging wrongs (see also Defensio fidei catholicae 3.1.4).

A lively question in Suárez’s day was whether human beings in the state of innocence would have lived in political communities or whether such communities only became necessary after the Fall with its introduction of sin. The second reason given above, in particular, would not seem to apply in the state of innocence. Suárez, however, is confident that political communities would have formed even in the state of innocence, had it continued (De opere sex dierum 5.7). Human beings are social animals, whether in the state of innocence or not. Furthermore, Suárez thinks that even in the state of innocence some people would surpass others in virtue and knowledge. So even though joining together in a political community might not be necessary for mere survival in that state, it would, nonetheless, be most useful for the sharing of knowledge and for encouragement to greater virtue.

That a political community requires some authority to govern it Suárez takes as well-nigh self-evident (DL 3.1.4-5). He does note, however, that one reason a governing authority is needed is that each individual of a political community looks after his or her own good. Individuals' goods, however, sometimes conflict with the common good. Consequently, a government looking after the common good is needed. Following Aristotle, Suárez thinks that a monarchy best fills the role of governing authority (DL 3.4.1). He does not think that monarchy is dictated by natural law, however, and grants that other forms of government, including democracy, may be "good and useful." Which form of government to adopt is left to human choice.

A political community does not result merely from a group of families living in proximity, even if they consequently become familiar friends. The formation of a political union requires an "explicit or tacit pact" to help each other and a subordination of the families and individuals to some governing authority (De opere sex dierum 5.7.3; cf. DL 3.2.4). The details of how this works are subject to scholarly dispute. Some commentators argue that on Suárez's view, the community's consent creates a political community but does not directly cause obligation to a political authority. Others argue that the consent does directly cause the obligation and authority.

An important feature of Suárez’s view is that political power does not just reside in the community initially. It always remains there. As he puts it, “after that power has been transferred to some individual person, even if it has been passed on to a number of people through various successions or elections, it is still always regarded as possessed immediately by the community” (DL 3.4.8). Suárez is, of course, aware that the needed stability of political communities would be in question if communities could withdraw their transfer of power to the government at every whim. So even though in some sense the power always remains in the community, Suárez argues that the transferred power may not ordinarily be withdrawn (DL 3.4.6). Suárez recognizes exceptions, however. Should the government become tyrannical, the door may be opened to legitimate revolt and even tyrannicide (Defensio fidei catholicae 6.4 and De charitate 13.8). This is the doctrine that gained Suárez the ire of James I of England.

4. Legacy

Although Suárez is largely unknown among non-specialists (at least in Anglo-American philosophy), his influence has never been in doubt among historians of early modern philosophy. There are, however, difficulties with establishing the extent of his influence. First, many of the canonical early modern figures seldom cite their sources. Descartes is perhaps best-known for this, but citations are hardly abundant in any of the other extrascholastic early moderns such as Spinoza, Malebranche, and Locke. The absence of citations is, of course, especially striking in comparison to the texts of Suárez and his fellow scholastics, which are replete with them. Second, encountering an idea or term in a modern text that looks very much like something in Suárez is insufficient to establish Suárezian influence, since there were hundreds of other scholastic theologians and philosophers, many of them also quite influential and many, especially Suárez's Jesuit confrères, saying things more or less similar to what Suárez says and making use of the same terms and distinctions standard in scholastic circles.

Consequently, there is substantial debate about just how indebted Descartes, for example, is to Suárez. Some historians emphasize that his philosophical formation would have occurred in his education at the Jesuit college of La Flèche and that he himself writes in a letter that the first seeds of everything he learned came from the Jesuits. Other historians, however, point out that in a different letter he claims to remember only two Jesuits, Antonio Rubio and Francisco de Toledo, and that he generally does not seem to have been a very attentive reader.

However much uncertainty there may be about the extent of Suárez's influence, it is certain that Hobbes, Descartes, Malebranche, Leibniz, and Berkeley all mention Suárez explicitly at least once and say a variety of things that might well be thought to borrow from or be inspired by Suárez. Wolff, too, cites Suárez and thinks highly of him, to the extent that Wolff has been characterized as begotten of Suárez. Insofar as there is a path from Wolff to Kant, there might, then, reasonably be thought to be a path from Suárez to Kant as well.

The story of influence just told is the one most frequently encountered. It omits, however, Suárez’s main influence. Due to the vagaries of academic fashion and dubious historiographies, large swaths of early modern philosophy receive virtually no attention today. Yet Suárez was most influential in these neglected realms, which saw the rise of a Suárezian school of philosophy.

Suárez and Gabriel Vasquez were seen as rival fathers of Jesuit theology and philosophy, leading to near-endless discussion of Suárez's views by early modern Jesuits. Pedro Hurtado de Mendoza and Rodrigo de Arriaga, for example, discuss Suárez extensively in their own work (in fact, they are sometimes treated as faithful Suárezians, although that is a mistake). It must be remembered, too, that the Jesuits were active worldwide, leading to a remarkably wide dissemination of Suárez's views. His influence was perhaps most profound in the early modern universities of Latin America, but Jesuit missionaries also spread his work to Africa and Asia, even starting a Chinese translation of DM in the seventeenth century.

Outside the Jesuit order, Suárez’s stature in Scotism—at its height in early modern Europe—is also striking. Texts by Scotist authors often include frequent and detailed discussion of Suárez’s views. His influence even transcended the main religious division of modern Western Europe. Protestant scholastics such as Francis Turretin and David Hollaz, departing from Luther’s contempt for scholasticism, borrowed freely from Suárez. Suárez is sometimes described as providing the received metaphysics for seventeenth-century Lutheran universities.

Contemporary readers usually come to Suárez via the canonical early modern philosophers such as Descartes and Leibniz, but noting these additional lines of influence does two things. In the first place, it helps provide a more accurate picture of the extent of Suárez’s influence. But, second, it also gives some indication of how many philosophers and theologians identified in Suárez a thinker of the highest caliber, with whose work it was worth engaging at length.

5. References and Further Reading

a. Primary Sources

  • Suárez, Francisco. Opera omnia. Paris: Louis Vivès, 1856-78.
    • This standard edition is the most readily available, including freely online. It is not, however, a critical edition and does not include quite all of Suárez’s works. All major philosophical works are included. No English translation of a complete work has been published. Significant portions of some works, especially Disputationes metaphysicae (DM), have been published, however, and additional translations in various stages of polish are available online.

b. Secondary Sources

This abbreviated bibliography focuses on English works published in the last several years.

  • Doyle, John P. Collected Studies on Francisco Suárez, S. J. (1548-1617). Edited by Victor M. Salas. Leuven: Leuven University Press, 2010.
    • Doyle has perhaps done more than anyone else for Suárez studies in the U.S.A. This is a collection of essays drawn from forty years of work. The theme that receives the most attention is Suárez’s conception of metaphysics and being, but the volume also includes several papers on Suárez’s account of law and human rights.
  • Fichter, Joseph H. Man of Spain: Francis Suárez. New York: Macmillan, 1940.
    • Still the standard English biography of Suárez, although it is rather hagiographical by contemporary standards.
  • Freddoso, Alfred J. “God’s General Concurrence with Secondary Causes: Why Conservation Is Not Enough.” Philosophical Perspectives 5 (1991): 553-85.
    • A superb account of Suárez’s arguments in DM 22 against mere conservationism, that is, his arguments for God’s constant concurrence with the actions of created things.
  • Freddoso, Alfred J. “Introduction: Suárez on Metaphysical Inquiry, Efficient Causality, and Divine Action.” On Creation, Conservation, and Concurrence: Metaphysical Disputations 20-22, xi-cxxi. By Francisco Suárez. South Bend: St. Augustine’s Press, 2002.
    • The focus is on Suárez’s account of creation, conservation, and concurrence, but the incisive introduction to Suárez’s metaphysics in general and account of efficient causation more particularly makes this an especially valuable essay.
  • Gracia, Jorge J. E. “Suárez’s Conception of Metaphysics: A Step in the Direction of Mentalism?” American Catholic Philosophical Quarterly 65.3 (1991): 287-309.
    • Argues for a realist interpretation of Suárez’s account of metaphysics.
  • Gracia, Jorge J. E. “Francisco Suárez: The Man in History.” American Catholic Philosophical Quarterly 65.3 (1991): 259-66.
    • A brief, accessible introduction to Suárez, setting him in his historical context.
  • Heider, Daniel. Universals in Second Scholasticism: A Comparative Study with Focus on the Theories of Francisco Suárez S. J. (1548-1617), Joao Poinsot O. P. (1589-1644), and Bartolomeo Mastri da Meldola O. F. M. Conv. (1602-1673)/Bonaventura Belluto O. F. M. Conv. (1600-1676). New York: John Benjamins Publishing Company, 2014.
    • The title is an accurate guide. Exemplary historical scholarship, but challenging reading for those unfamiliar with scholastic terminology.
  • Hill, Benjamin, and Henrik Lagerlund, eds. The Philosophy of Francisco Suárez. Oxford: Oxford University Press, 2012.
    • One of the first volumes that a student of Suárez should turn to; includes papers on a variety of topics.
  • Novotný, Daniel D. Ens rationis from Suárez to Caramuel: A Study in Scholasticism of the Baroque Era. New York: Fordham University Press, 2013.
    • A fine study of Suárez’s account of beings of reason, followed by discussions of the accounts offered subsequently by Hurtado, Mastri/Belluto, and Caramuel. Novotný’s work is marked by both erudite historical scholarship and keen philosophical analysis.
  • Penner, Sydney. “Suárez on the Reduction of Categorical Relations.” Philosophers’ Imprint 13 (2013): 1-24.
    • Argues that Suárez gives a realist but reductionist account of relations, albeit with some problematic results.
  • Perler, Dominik. “Suárez on Consciousness.” Vivarium 52.3-4 (2014): 261-86.
    • An illuminating examination of Suárez’s account of our access to our own acts of perception and thinking. Looks at his distinction between first-order sensory consciousness and second-order intellectual consciousness, as well as what explains the unity of consciousness.
  • Salas, Victor and Robert Fastiggi, eds. A Companion to Francisco Suárez. Brill’s Companions to the Christian Tradition 53. Leiden: Brill, 2015.
    • Covers a number of areas that are neglected by the Schwartz and Hill/Lagerlund collections.
  • Schwartz, Daniel, ed. Interpreting Suárez: Critical Essays. Cambridge: Cambridge University Press, 2012.
    • An excellent set of essays on Suárez, treating a selection of key topics.
  • Shields, Christopher. “Virtual Presence: Psychic Mereology in Francisco Suárez.” In Partitioning the Soul: Ancient, Medieval, and Early Modern Debates, edited by K. Corcilius and D. Perler, 199-219. Berlin: W. de Gruyter, 2014.
    • Examines Suárez’s account of the soul and its parts and what the talk of parts comes to.
  • Shields, Christopher and Daniel Schwartz. “Francisco Suárez.” Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. 2014. Accessed 30 Mar. 2015.
    • A fine survey of Suárez’s life and philosophy that covers some topics neglected in the present entry.

Author Information

Sydney Penner
Email: sfp@sydneypenner.ca
Asbury University
U. S. A.

Scientific Representation

Many philosophers hold that science is intended to represent reality. For example, some philosophers of science would say that Newton’s theory of gravity uses the theoretical terms ‘center of mass’ and ‘gravitational force’ in order to represent how a solar system of planets behaves—the changing positions and velocities of the planets, but not their color changes. It is very difficult, however, to give a precise account of what scientific representation is. Broadly speaking, scientific representation is the important and useful relationship that holds between scientific sources (for example, models, theories, and data models) and their targets (for example, real-world systems and theoretical objects). There is a long history within philosophy of describing the nature of the representational relationship between concepts and their objects, but the discussion of scientific representation in particular began in twentieth-century philosophy of science.

There are a number of different questions one can ask when thinking about scientific representation. The question which has received the most attention, and which will receive the most attention here, is what might be called (following Callender and Cohen 2006, 68) the “constitution question” of scientific representation: “In virtue of what is there representation between scientific sources and their targets?” This question has been answered in a wide variety of ways, with some arguing that representation is secured by a structural identity or similarity between source and target, and others arguing that the relationship is only a pragmatic one. Other questions about scientific representation relate more specifically to the ways in which representations are used in science. These questions are more typically asked directly about certain sorts of representational objects, especially scientific models, as well as from the perspective of the sociology of science.

Table of Contents

  1. Substantive Accounts
    1. Structuralist Views
      1. Isomorphism
      2. Partial Isomorphism
      3. Homomorphism
    2. Similarity Views
    3. Critiques of Substantive Accounts
  2. Deflationary and Pragmatic Accounts
    1. DDI
    2. Inferential
      1. Suárez
      2. Contessa
    3. Agent-Based Versions of Substantive Accounts
      1. Agent-Based Isomorphism
      2. Agent-Based Similarity
    4. Gricean
    5. Critiques of Deflationary and Pragmatic Accounts
  3. Model-Based Representation
    1. Models as Representations (and More)
    2. Model-Building
    3. Idealization
  4. Sociology of Science
    1. Representation and Scientific Practice
    2. Circulating Reference
    3. Critiques of Sociology of Science
  5. References and Further Reading

1. Substantive Accounts

Scientific representation became a rising topic of interest with the development of the semantic view of theories which was itself developed partly as a response to the syntactic view of theories. Briefly, on the syntactic view, theoretical terms are defined in virtue of relationships of equivalence with observational entities (Suppe 1974). This was done through the creation of a first-order predicate calculus which contained a number of logical operators as well as two sets of terms, one set filled with theoretical terms and the other with observational terms. Each theoretical term was defined in terms of a correspondence rule linking it directly to an observational term. In this way a theoretical term such as ‘mass’ was given an explicit observational definition. This definition used only phenomenal or physical terms [such as “Drop the ball from the tower in this way” and “observe the time until the ball hits the ground”] plus logical terminology [such as ‘there is’ and ‘if…then’]; but the definition did not use any other theoretical terms [such as ‘gravitational force’ or ‘center of mass’]. The logical language also included a number of axioms, which were relations between theoretical terms. These axioms were understood as the scientific laws, since they showed relationships that held among the theoretical terms. Given this purely syntactic relationship between theory and observed phenomena, there was no need to give any more detailed account of the representation relationship that held between them. The correspondence rule syntactically related the theory with observations.
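
To make the syntactic picture more concrete, consider a schematic correspondence rule of the kind made famous by Carnap’s reduction sentences (the solubility example is a standard textbook illustration, not an example drawn from Suppe):

    \forall x \, \big( Wx \rightarrow ( Sx \leftrightarrow Dx ) \big)

Here W (“x is placed in water”) and D (“x dissolves”) are observational terms, S (“x is soluble”) is a theoretical term, and the only other vocabulary is logical. The rule ties the theoretical term to observational vocabulary plus logical terminology, which is just the strategy the syntactic view applies to terms such as ‘mass’.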

The details of the rejection of the syntactic view are beyond the scope of this article, but suffice it to say that this view of the structure of theories was widely rejected. With this rejection came a different account of the structure of theories, what is often called the semantic view. Since there was no longer any direct syntactic relationship between theory and observation, it became of interest to explain what relationship does hold between theories and observations, and ultimately the world.

Before examining the accounts of scientific representation that arose to explain this relationship, we should get a basic sense of the semantic view of theories. The common feature of semantic approaches is that theories should not be thought of as a set of axioms together with defined syntactic correspondences between theory and observation. Instead, theories are “extralinguistic entities which may be described or characterised by a number of different linguistic formulations” (Suppe 1974, 221). That is to say, theories are not tied to a single formulation or even to a particular logical language. Instead, theories are thought of as sets of related models. This is better understood through Bas van Fraassen’s (1980) example.

Van Fraassen (1980, 41-43) asks us to consider a set of axioms which are constituents of a theory which will be called T1:

A0 There is at least one line.

A1 For any two lines, there is at most one point that lies on both.

A2 For any two points, there is exactly one line that lies on both.

A3 On every line there lie at least two points.

A4 There are only finitely many points.

In Figure 1, we can see a model which shows that T1 is consistent, since each of the axioms is satisfied by this model.

[Figure 1 – Model of Consistency of T1]

Notice this is just one model which shows the consistency of T1, since there are other models which could be constructed to satisfy the axioms, like van Fraassen’s Seven Point Geometry (1980, 42). Note that what is meant here by ‘model’ is whatever “satisfies the axioms of a theory” (van Fraassen, 1980, 43). Another, perhaps more intuitive, way of expressing this is that a model of a theory T is any structure which would make T true if that structure were the entirety of the universe. For example, if the structure depicted in Figure 1 were the entirety of the universe, then clearly T1 would be true. Notice also that, on the semantic view, the axioms themselves are not central in understanding the theory. Instead, what is important in understanding a theory is understanding the set of models which are each truth-makers for that theory, insofar as they satisfy the theory.
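
As a small illustration of what it means for a structure to satisfy the axioms, the following sketch (the particular three-point structure is a hypothetical choice, not van Fraassen’s own example) checks a finite arrangement of points and lines against A0-A4:

    # A minimal sketch: checking that a finite incidence structure satisfies
    # the axioms A0-A4 of the toy theory T1. The points and lines below are
    # hypothetical illustrations.
    from itertools import combinations

    points = {"p", "q", "r"}                      # a "line" is the set of points on it
    lines = [{"p", "q"}, {"q", "r"}, {"p", "r"}]

    def satisfies_T1(points, lines):
        if not lines:                                             # A0: at least one line
            return False
        if any(len(l1 & l2) > 1 for l1, l2 in combinations(lines, 2)):
            return False                                          # A1: lines share at most one point
        for a, b in combinations(points, 2):                      # A2: two points, exactly one line
            if sum(1 for l in lines if {a, b} <= l) != 1:
                return False
        if any(len(l) < 2 for l in lines):                        # A3: every line has two points
            return False
        return True                                               # A4: finitely many points (trivially)

    print(satisfies_T1(points, lines))  # True: this structure is a model of T1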

This account of the structure of theories can be applied to an actual scientific theory, like classical mechanics. Here, following Ronald Giere (1988, 78-79), we can take up the example of the idealized simple systems in physics. These are, he argues, models for the theory of classical mechanics. For example, the simple harmonic oscillator is a model which is a truth-maker for (part of) classical mechanics. The simple harmonic oscillator can be described as a machine: “a linear oscillator with a linear restoring force and no others” (Giere 1988, 79); or mathematically: F = -kx. This model, were it the entirety of the universe, would make classical mechanics true.
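
The same point can be made computationally. The sketch below (with arbitrary, hypothetical parameter values) simply steps the idealized oscillator governed by F = -kx forward in time; what the code describes exactly is the model itself, which, were it the entirety of the universe, would make the relevant part of classical mechanics true:

    # A minimal sketch of the simple harmonic oscillator model, F = -k x,
    # integrated with a basic (semi-implicit) Euler step. Parameter values
    # are arbitrary illustrations.
    k, m = 4.0, 1.0          # spring constant and mass
    x, v = 1.0, 0.0          # initial displacement and velocity
    dt = 0.001               # time step

    for step in range(5000):
        a = -k * x / m       # Newton's second law with the linear restoring force
        v += a * dt
        x += v * dt

    print(round(x, 3))       # displacement after 5 units of model time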

The targets of theoretical models on the semantic view are not always real-world systems. On some views, there is at least one other set of models which serve as the targets for theoretical models. These are variously called empirical substructures (van Fraassen 1980) or data models. These are ways of structuring the empirical data, typically with some mathematical or algebraic method. When scientists gather and describe empirical data, they tend to think of and describe it in an already partially structured way. Part of this structure is the result of the way in which scientists measure the phenomena while being particularly attentive to certain features (and ignoring or downplaying others). Another part of this structuring is due to the patterns seen in the data which are in need of explanation. On some views, most notably van Fraassen’s (1980), the empirical model is the phenomenon which is being represented. That is to say, there is no further representational relationship holding between data models and the world, at least as far as scientific practice is concerned (for a discussion of this, see Brading and Landry 2006). Others argue that the relationship between theoretical models and data models is only one of a number of interesting representational relationships to be described, which arrange themselves in a hierarchical structure (French and Ladyman 1999, 112-114).

With this semantic account of the structure of scientific theories in place, there arose an interest in giving an account of the representational relationship. The views which arose with the semantic view of theories are here called “substantive” because they all attempt to give an account of the representational relationship which looks to substantive features of the source and target. Another way of putting this (following Knuuttila 2005) is to say that the substantive accounts of representation seek to explain representation as a dyadic relationship which holds between only the source and the target. As will be discussed below, this is different from the deflationary and pragmatic accounts, which view scientific representation as at least a triadic relationship insofar as they add an agent to the relationship. There are two major classifications of substantive accounts of the representational relationship. The first are the structuralist views, which are divided into three main types: isomorphism, partial isomorphism, and homomorphism. The second category is the similarity views.

a. Structuralist Views

Generically, the structuralist views claim that scientific representation occurs in virtue of what might be called “mapping” relationships that hold between the structure of the source and the structure of the target, i.e. the parts of the theoretical models point to the parts of the data models.

i. Isomorphism

Isomorphism holds between two objects provided that there is a bijective function—that is, both injective (or one-to-one) and surjective (or onto)—between the source and the target. Formally, suppose there are two sets, set A and set B. Set A is isomorphic to set B (and vice versa) if and only if there is a function, call it f, which could be constructed between A and B which would take each member of set A and map it to one and only one member of set B such that each member of set B is mapped.

To make the point more clear, let us suppose that set A is full of the capital letters of the English alphabet and set B is full of the natural numbers 1 through 26. We could create a function which, when given a letter of the alphabet, will output a number. Let’s make the function easy to understand and let f(A) = 1, f(B) = 2, and so on, according to typical alphabetical order. This function is bijective because each letter is mapped to one and only one number and every number (1 – 26) is being picked out by one and only one letter. Notice that since we can draw a bijective function from the letters to the numbers, we can also create one from the numbers to the letters: most simply, let f’(1) = A, f’(2) = B, and so on. Of course, there is nothing apart from the ease of our understanding which requires that we link A and 1, since we could have linked 1 with any letter and vice versa.
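
The following is a minimal sketch of the alphabet example just described, building the function f and confirming that it is injective and surjective, and hence a bijection:

    # Building the bijection f(A) = 1, f(B) = 2, ... and checking it.
    import string

    letters = list(string.ascii_uppercase)       # set A: the 26 capital letters
    numbers = list(range(1, 27))                 # set B: the natural numbers 1-26

    f = {letter: index + 1 for index, letter in enumerate(letters)}

    injective = len(set(f.values())) == len(f)        # no two letters share a number
    surjective = set(f.values()) == set(numbers)      # every number is picked out
    print(injective and surjective)                   # True: f is a bijection

    f_inverse = {n: letter for letter, n in f.items()}  # the inverse map f'(1) = A, ...
    print(f_inverse[1], f_inverse[26])                  # A Z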

Isomorphism has frequently been used to explain representation (see van Fraassen 1980, Brading and Landry 2006). Since theories, on the semantic view, are a group of related models, there is a certain sort of structure that each of these models has. Most of the time, they are thought of as mathematical models though they need not be only mathematical as long as they have a structure. Van Fraassen also identifies what he calls “appearances,” which he defines as “the structures which can be described in experimental and measurement reports” (1980, 64). So, the appearances are the measurable, observable structures which are being represented (the targets of the representation). On van Fraassen’s account (1980), a theory will be successfully representational provided that there is an isomorphic relationship between the empirical substructures (the sources) and the appearances (as targets), and an isomorphic relationship between the theoretical models (as sources) and the empirical substructures (as targets). (Or at any rate, this is how he has commonly been interpreted (see Ladyman, Bueno, Suárez, and van Fraassen 2010)). As described by Mauricio Suárez (2003, 228), this isomorphism between the models shows that there is an identity that holds between the “relational framework of the structures” of the source and the target. And it is this relational framework of structures which is being maintained.

So, on the isomorphism view of scientific representation, some scientific theory represents some target phenomena in virtue of a bijective mapping between the structures of a theory and data, and a bijective mapping between the data and the phenomena. Notice that on the isomorphic view, the bijections which account for representation are external to the theoretical language. That is to say that the relationship that holds between the theory and the phenomena is not internal to the language in which the theory is presented. This is an important feature of this account because it allows for a mapping between very different kinds of structures. Presumably the (mainly mathematical) structures of theories are quite different from the structures of data models and are certainly very different from the structures of the phenomena (because the phenomena are not themselves mathematical entities). However, since the functions are external, we can create a function which will map these very different types of structures to one another.

ii. Partial Isomorphism

Isomorphism has much to be said for it, especially when focusing on theories which are expressed mathematically. This is especially true in more mathematically driven fields like physics. It seems that the mathematical models in physics represent the structure that holds between various real-world phenomena. For example, F = ma represents the way in which certain features of an object (its mass, the rate at which it is being accelerated) correspond to another (the net force acting on it). However, many philosophers (for example, Cartwright 1983 and Cartwright, Shomar, and Suárez 1995) have pointed out that there are cases where a theory or model truly represents some phenomena even though there are features of the phenomena which have no corresponding structure in the theory or model, due to abstraction or idealization.

Take a rather simple example, the billiard ball model of a gas (French and Ladyman 1999). Drawing on Mary Hesse’s (1966) important work on models, French and Ladyman argue that there are certain features of the model which are taken to be representative, for example, the mass and the velocity of the billiard balls represent the mass and velocity of gas atoms. There are also certain features of the billiard balls which are non-representational, for example, the colors of the balls. Most importantly, though, as a critique of isomorphism, there are typically also some undetermined features of the balls. That is to say, for some of the features of the model, it is unknown whether they are representational or not. (For a more detailed scientific example, see Cartwright, Shomar, and Suárez 1995).

To respond to problems of this sort, many (French and Ladyman 1999; Bueno 1997; French 2003; da Costa and French 2003) have argued for partial isomorphism. The basic idea is that there are partial structures of a theory for which we can define three sets of members for some relation. The first set will be those members which do have the relevant relation, the second set will be those members which do not have that relation, and the third will be those members for which it is unknown whether or not they have that relation. It is possible to think of each of these sets of individuals as being a relation itself (since a relation, semantically speaking, is extensionally defined), and so we could draw a bijective function between these relations. But, as long as the third relation (the third set of individuals for which it was unknown whether or not they had the relation) is not empty, then the isomorphism will be only partial because there are some relations for which we are unsure whether or not they hold in the target.

As a more concrete example, consider the billiard ball and atom example from above. In order for there to be a partial isomorphism between the two, we must be able to identify two partial structures of each system, that is, a partial structure of the billiard ball model and a partial structure of the gas atoms. Between these partial structures, there must be a bijective function which maps relations of the model to relations of the gas-system. For example, the velocity of the billiard balls will be mapped to the velocity of respective atoms. There must be a second function which maps those non-representational relations of the model to features of the gas-system which are not being represented. For example, a non-representative feature of the model, like the color of the billiard balls, will be mapped to some feature of the system which is not being represented, like the non-color of the atoms. All the same, this will still remain partial because there will be certain relations that the model has which are unknown (or undefined) in relationship to the gas-system.
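
The three-part division behind partial structures can be put schematically as follows (the feature lists are hypothetical illustrations for the billiard-ball case, not drawn from French and Ladyman):

    # A minimal sketch of the three sets behind a partial structure: features
    # known to correspond, features known not to, and undetermined features.
    corresponds = {            # features of the billiard-ball model mapped to the gas
        "mass": "atomic mass",
        "velocity": "atomic velocity",
    }
    does_not_correspond = {    # non-representational features of the model
        "color": None,
    }
    undetermined = [           # features whose representational status is open
        "hardness",
        "surface texture",
    ]

    def is_only_partial(corresponds, does_not_correspond, undetermined):
        # Following the text: the isomorphism between the partial structures
        # is merely partial so long as the undetermined set is non-empty.
        return len(undetermined) > 0

    print(is_only_partial(corresponds, does_not_correspond, undetermined))  # True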

iii. Homomorphism

Homomorphism, defended by Bartels (2006), is more general than isomorphism insofar as all isomorphisms are homomorphisms, but not all homomorphisms are isomorphisms. Homomorphisms still rely on a function being drawn between two sets, but they do not require that the function be bijective; that is, the function need not be one-to-one or onto. So, this means that not every relation and part of the theory must map on to one and only one relation or part of the target systems. Additionally, this permits that there be parts and relations in the target system which are unmapped. Homomorphisms allow for a great deal of flexibility with regard to misrepresentations.
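
A simple illustration of this extra flexibility (the structures chosen here are arithmetical stand-ins, not an example from Bartels): the map sending each integer to its parity preserves additive structure but is many-to-one, so it is a homomorphism without being an isomorphism:

    # n -> n mod 2, from the integers under addition to {0, 1} under
    # addition mod 2: structure-preserving, but not injective.
    def h(n):
        return n % 2

    samples = range(-10, 11)
    preserves = all(h(a + b) == (h(a) + h(b)) % 2 for a in samples for b in samples)
    injective = len({h(n) for n in samples}) == len(list(samples))

    print(preserves)   # True: the map is a homomorphism
    print(injective)   # False: many integers collapse to the same value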

b. Similarity Views

Isomorphism (and the other -morphisms) places a fairly strict requirement on the relevant constitutive features of representation, which are on these views structural. But, as Giere points out (1988, 80-81), this is often not the relevant relationship. Oftentimes, scientists are working with theories or models which are valuable not for their salient structural features, but rather for some other reason. For example, when modeling the behavior of water flowing through pipes, scientists often model the water as a continuous fluid, even though it is actually a collection of discrete molecules (Giere 2004). Here, the representational value of the model is not between the structure of the model and the structure of the world (since water is structurally not continuous, but rather a collection of discrete molecules). Instead, the relevant representational value comes from a more general relationship which holds between the behavior of the modeled and real world systems. Giere suggests that what is needed is “a weaker interpretation of the relationship between model and real system” (1988, 81). His suggestion is that we explain representation in virtue of similarity. On his account a model will represent some real world system insofar as it is similar to the real world system. Notice that this is a much weaker account of representation than the structural accounts, since similarity includes structural similarities, and so encompasses isomorphism, partial isomorphism, and homomorphism.

Of course, if we try hard enough, we can notice similarities between any two objects. For example, any two material objects are similar at least insofar as they are each material. Thus, Giere suggests that an account of scientific representation which appeals to similarity requires an “implicit” (or explicit) “specification of relevant respects and degrees” (81). Respects indicate the relevant parts and ways in which the model is taken to be representative. Perhaps it is some dynamical relationship expressed in an equation; perhaps it is some physical similarity that exists between some tangible model and some target object (for example, a plastic model of a benzene ring); perhaps it is the way in which two parts of a model are able to interact with one another, which shows how two objects in the target system might interact (like the relevant behavior of the model of water flowing through pipes). Claims about the respects of similarity are constrained only by what scientists know or take to be the case about the model and the target system. For example, a scientist could not claim that there was a similarity between the color of a benzene model and a benzene ring, since benzene rings have no color. Similarly, a scientist could not claim that there is similarity between the color of a mathematical model and the color of a species of bacteria, since a mathematical model does not have any color. Notice that it is insufficient merely to specify the respects in which a model is similar, since similarity can come in degrees. Of course, there is a whole spectrum of degrees of similarity on which any particular similarity can fall. A source can be anywhere from an extremely vague approximation of its target to being nearly identical to its target (what Giere calls “exact” (1988, 93)) and everywhere in between.

Giere’s own example is that, “The positions and velocities of the earth and moon in the earth-moon system are very close to those of a two-particle Newtonian model with an inverse square central force” (1988, 80). Here, the relevant respects are the position and velocity of the earth and moon. The relevant degree is that the positions and velocities in the earth-moon system are “very close” to the two-particle Newtonian model. These respects and degrees thus give us an account of how we should think of the similarity between the model and the target system.

Giere uses similarity to describe the relationship between models and the real-world systems they represent, and sometimes between different models (one model may be a generalization of another, and so on). Theories themselves are constituted by a set of these models as well as some hypotheses that link the models to the real world which define the respect and degree of the similarity between the models and their targets.

More recently, Michael Weisberg (2013) has argued for a similarity account of representation. In brief, his view distinguishes two sets of things in both source and target: the attributes and the mechanisms. Given this distinction, an equation can be written in which the shared attributes and mechanisms are the intersection of the attributes of the model and of the target system, together with the intersection of the mechanisms of the model and of the target system. The dissimilarities can be identified in a similar fashion. He adds weighting terms and functions to these sets, which allow users to indicate which similarities are more important than others. Rewriting the equation as a ratio between similarities and dissimilarities results in a method by which we can make comparative judgments about different models. In this way, we will be able to say, for example, that one model is more or less similar to a target than another. A rough computational sketch of this idea is given below.
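
Under the assumption that such a score amounts to a weighted ratio of shared to differing features, a sketch of a comparative similarity measure might look like this (the feature sets and weights are hypothetical, and this is not Weisberg’s exact equation):

    # A rough sketch: weighted ratio of shared features to shared-plus-differing
    # features. All feature sets and weights are hypothetical illustrations.
    def similarity(model_feats, target_feats, weight):
        shared = model_feats & target_feats
        differing = model_feats ^ target_feats            # symmetric difference
        w_shared = sum(weight.get(f, 1.0) for f in shared)
        w_diff = sum(weight.get(f, 1.0) for f in differing)
        return w_shared / (w_shared + w_diff) if (w_shared + w_diff) else 0.0

    model_feats = {"predation mechanism", "growth rate", "continuous population"}
    target_feats = {"predation mechanism", "growth rate", "discrete individuals"}
    weights = {"predation mechanism": 2.0}                # deemed more important

    print(round(similarity(model_feats, target_feats, weights), 2))
    # A second model could be scored the same way, supporting the comparative
    # judgment that one model is more or less similar to the target than another.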

c. Critiques of Substantive Accounts

While similarity and isomorphism continue to have some support in the contemporary literature (especially in modified versions; see section 2c below), the versions described above have faced serious criticisms. One of the most common arguments against the substantive views is that they are unable to handle misrepresentations (Suárez 2003, 233-235; Frigg 2006, 51). Many models in science do not accurately reflect the world, and, in fact, a model is often viewed as particularly useful because of (not in spite of) its misrepresentations. Nancy Cartwright (1983) has famously argued for a fictional account of modelling and made this case for the laws of physics. Others have shown that similar things are true in other scientific domains (Weisberg 2007a). When theories are intentionally inaccurate, there will be difficulty in explaining, with reference to isomorphism or similarity, the way in which these theories are representational (as scientists and philosophers often take them to be).

Suárez (2003, 235-237) has also argued that similarity and isomorphism are each neither necessary nor sufficient for representation. Consider isomorphism first. It cannot be necessary for representation, given that scientists often take certain theories to be representative of their real-world targets even though there is no isomorphic relationship between the theory and the target system. The same is true of similarity. Using his example, suppose that there is an artist painting an ocean view, using some blue and green paints. This painting has all sorts of similarities to the ocean view she is representing: both the painting and the ocean are on the same relative side of the moon, both are in her line of vision at time t, both share certain colors, and so forth. But which similarities are relevant to its being representative and which are merely contingent is up to the discretion of the agent who takes it to be representative of the ocean view in certain respects (as Giere argued). But if this is the case, then it turns out that A represents B if and only if A and B are similar in those respects in which A represents B. This ultimately leaves representation unexplained.

Supposing we can give some account of salience or attention or some other socially based response to this first problem (which seems possible), we are left with the problem that plenty of salient similarities are non-representational. Suárez makes this point with Picasso’s Guernica (2003, 236). The bull, crying mother, eye, knife, and so forth are all similar to certain real-world objects. But the painting is not a representation of those objects; it represents the horrors of the bombing of Guernica during the Spanish Civil War.

Suárez also argues that both similarity and isomorphism are insufficient for representation. Consider the first, similarity. Take any given manufactured item, for example, an Acer C720 Chromebook, a computer which is similar to many other computers (hundreds of thousands). Notice that the fact of its similarity is insufficient to make it represent any of the other computers. Even if we add in Giere’s requirement that there be hypotheses which define the respects and degrees of the similarity, the insufficiency will remain. In fact, it seems as though there are hypotheses which define the relevant respects and degrees of similarity between the computers: Acer’s engineers and quality control have made sure that the production of these computers will result in similar computers. All the same, even with these hypotheses which give respects and degrees, we would not want to say that any given computer represents the others.

The non-sufficiency problem holds for isomorphism as well. Suppose someone were to write down some equation which had various constants and variables, and expressed certain relationships that held between the parts of the equation. Suppose now that, against all odds, this equation turns out to be isomorphic to some real-world system, say, that it describes the relationship between rising water temperatures and the reproduction rate of some species of fish which is native to mountain streams in the Colorado Rockies. To many, it appears to be counterintuitive to think that representations could happen accidentally. However, if isomorphism is sufficient for representation, then we would have to admit that the randomly composed equation does represent this fish species, even if no one ever uses or even recognizes the isomorphic relationship.

There are other arguments against these views in general, an important one being that they lack the right logical properties. Drawing on the work of Goodman (1976), both Suárez (2003, 232-233) and Roman Frigg (2006, 54) argue that representation has certain logical properties which are not shared by similarity or isomorphism. Representation is non-symmetric, so when some A represents B, it does not follow that B represents A. Representation is non-transitive: if A represents B and B represents C, it does not follow that A represents C. It’s also non-reflexive: A does not represent itself. Since isomorphism is reflexive, transitive, and symmetric, and similarity is reflexive and symmetric, they do not have the properties required to account for representation.

There are replies to these arguments on behalf of the substantive views. First, there is a general question about whether or not we are justified in making inferences from representation in art to representation in science. As was discussed above, many of the criticisms against substantive views draw examples from the domain of art (for example, Suárez (2003) uses many examples of paintings and draws upon Goodman (1976), which discusses representation in art). But it should not be taken as given that what holds in art must translate to science. In fact, in many cases, the practices in art seem to be quite different from the practices in science. As Bueno and French say, “After all, what do paintings—in particular those that are given as counter-examples to our approach, which are drawn from abstract art—really have to do with scientific representation?” (2011, 879).

Following Anjan Chakravartty (2009), Otávio Bueno and French (2011) argue that something like similarity or partial isomorphism is, in fact, necessary for successful representation in science. If there were no similarity or isomorphism at all, the successful use of models “would be nothing short of a miracle” (885). That is to say, while similarity or partial isomorphism might not be the whole story, they are at least part of the story. Using the aforementioned example of Picasso’s Guernica, they note that “there has to be some partial isomorphism between the marks on the canvass and specific objects in the world in order for our understanding of what Guernica represents to get off the ground” (885).

Replies have been made to the other arguments as well. Bueno and French (2011) argue that their account of partial isomorphism can meet all of the criticisms raised by Suárez (2003) and Frigg (2006). Adam Toon (2012) discusses some of the ways in which supporters of a similarity account of representation might respond to criticisms. Bartels (2006) defends the homomorphism account against these criticisms.

2. Deflationary and Pragmatic Accounts

If, as these scholars have argued, these substantive views will not work to explain scientific representation, what will? Suárez (2015) argues that what is needed instead is a deflationary account. A deflationary account claims “that there is no substantive property or relation at stake” (37) in debates about scientific representation. Deflationary accounts are typically marked by a couple of features. First, a deflationary account will deny that there are any necessary and sufficient conditions of scientific representation, or if there are, they will lack any explanatory value with regard to the nature of scientific representation. Second, these accounts will typically view representation as a relationship which is deeply tied to scientific practice. As Suárez puts it, “it is impossible, on a deflationary account, for the concept of representation in any area of science to be at variance with the norms that govern representational practice in that area…representation in that area, if anything at all, is nothing but that practice” (2015, 38).

Already we can see that these views will be quite different from the substantive views. Each of those views was substantive in the sense that it gave necessary and sufficient conditions for representation. There was also a distinct way in which those views were detached from scientific practice, since whether something was representational had little to do with whether or not it was accepted by scientists as representational and more to do with the features of the source and target. In each case, representation was a relationship accounted for entirely by features of the theory or model and the target system. As Knuuttila (2005) describes it, these were all dyadic (two-place) accounts insofar as the relationship held between only two things. The deflationary accounts take a markedly different direction by moving to at least a triadic (three-place) account of representation.

In some cases, the views that have developed follow the general lead of deflationary views in giving a central role to the work of an agent in representation, yet do not themselves qualify as deflationary, given that they still offer necessary and sufficient conditions of representation. Given the importance of the role of agents and aims, we might call these views pragmatic. Although pragmatic and deflationary views are importantly distinct in their aims, they share many common threads, and in many cases the views could be reinterpreted as deflationary or pragmatic with little effort. As such, they will be grouped together in this section.

a. DDI

The earliest deflationary account of representation was RIG Hughes’ DDI Account (1997). The DDI Account consists of three parts: denotation, demonstration, and interpretation. Denotation is the way in which a model or theory can reference, symbolize, or otherwise act as a stand-in for the target system. The sort of denotation being invoked by Hughes is broad enough to include the denotation of concrete particulars (for example, a model of the solar system will denote particular planets), the denotation of specific types (for example, Bohr’s theory models not just this hydrogen atom, but all hydrogen atoms), and the denotation of a model of some global theory (for example, this particular model is “represented as a quantum system” (S331)). In each case, the model denotes something else; it stands in for some particular concrete object, some type of theoretical object, or some type of dynamical system.

We might think this relationship sufficient for representation, since the fact that scientists treat certain objects or parts of models as being stand-ins or symbols for some target system seems to answer the question of the relationship between a model and the world. Hughes, though, thinks that in order to understand scientific representation, we need to examine how it is actually used in scientific practice. This requires additional steps of analysis. The second part of Hughes’ DDI Account is demonstration. This is a feature by which models “contain resources which enable us to demonstrate the results we are interested in” (S332). That is, models are typically “representations-as,” meaning not only do scientists represent some target object or system, but they also represent it in a certain way, with certain features made to be salient. The nature of this salience is such that it allows users to draw certain types of conclusions and make certain predictions, both novel and not. This is demonstration in the sense that the models are the vehicles through which (or in which) these insights can be drawn or demonstrated, physically, geometrically, mathematically, and so forth. This requires that they be workable or used in certain ways.

The final part of the DDI Account is interpretation. It is insufficient that the models demonstrate some particular insight. The insight must be interpreted in terms of the target system. That is to say, scientists can use the models as vehicles of the demonstration, but in doing so, part of the representational process as defended in the DDI Account is that scientists interpret the demonstrated insights or results not as features of the model, but rather as features which apply to the target system (or at least, the way scientists are thinking of the target system).

In summary, with denotation, we are moving in thought from some target system to a model. We take a model or its parts to stand in or symbolize some target system or object. In demonstration, we use the model as a vehicle to come to certain insights, predictions, or results with regard to the relationship that holds internal to the model. It is in interpretation that we move from the model back to the world, taking the results or insights gained through use of the model to be about the target system or object in the world.

b. Inferential

i. Suárez

After criticizing the substantive accounts in his (2003), Suárez (2004) developed his own account of representation which focused centrally on inference and inferential capacities, what he calls an inferential conception of representation. As he describes it there, this account involves two parts. The first part is what he calls representational force. Representational force is defined as “the capacity of a source to lead a competent and informed user to a consideration of the target” (2004, 768). Representational force can exist for a number of reasons. One way to get representational force is to repeatedly use the source as a representation of the target. Another way is in virtue of intended representational uses, that is, in virtue of the intention of the creator or author of some source viewed within the context of a broader scientific community. Oftentimes, the representational force will occur as a combination of the two. It is also a contextual property, insofar as it requires that the agent using the source has the relevant contextual knowledge to be able to go from the source to the (correct/intended) target.

So, for example, in the upper left-hand corner of my word processor is a little blueish square with a smaller white square and a small dark circle inside of it (it is supposed to be an image of a floppy disk). This has representational force insofar as it allows me to go from the source (the image of the floppy disk) to the target (a means of saving the document which I am currently writing). In this case, the representational force exists in virtue of both the intended representational uses (the creators of this word processor surely intend this symbol to stand in for this activity) as well as repeated uses (I am part of a society which has, in the past, repeatedly used an image of a floppy disk to get to this target, not only in this program but in many others as well). It is also contextual: someone who had never used computers would not have the requisite knowledge to be able to use the icon correctly.

This is part of the story for Suárez, but in order to have scientific representation there must be something more than mere representational force. On his view, scientific representations are subject to a sort of objectivity which does not necessarily exist for other representations, for example, the example above of the save icon. The objectivity is not meant to indicate that there is somehow an independent representational relationship that exists in the world when scientists are engaged in scientific representation. Instead, the objectivity is present insofar as representations are constrained in various ways by the relevant features of the target system which is being represented. That is, because there is some real feature which scientists are intentionally trying to represent in their scientific models and theories, the representation cannot be arbitrary but must respond to these relevant features. So the constraints are themselves objective, but this does not commit Suárez to identifying some reified relationship that holds between sources and targets.

According to Suárez, if we are going to get this objectivity in representations, we must turn to a second feature: the capacity of a source to allow for surrogate reasoning. This second feature requires that informed and competent agents be led to draw specific inferences regarding the target. These inferences can be the result of “any type of reasoning…as long as [the source] is the vehicle of the reasoning that leads an agent to draw inferences regarding [the target]” (2004, 773). Suárez’s point here is that not only does the source lead the agent to the target, but it also leads the agent to think about the target in a particular way, coming to particular insights and inferences regarding the target by way of the source.

More recently, Suárez (2010) has argued that this second feature, the capacity for surrogate reasoning, typically requires that three things be in place. First, the source must have internal structure such that certain relations between parts can be identified and examined. Secondly, when examining the parts of the source, scientists must do so in terms of the target’s parts. Finally, there must be a set of norms defined by the scientific practice which define and limit which inferences are “correct” or intended. It is in virtue of these norms of the practice that an agent will be able to draw the relevant and intended inferences, making the representation a part of that particular scientific practice. Of course, he takes his view to be deflationary, so these are not to be understood as necessary and sufficient conditions of the capacity for surrogate reasoning, but rather features which are frequently in place.

Consider an example of a mathematical model, for example the Lotka-Volterra equations. The model is supposed to be representational of predator-prey relationships. Part of this is Suárez’s representational force—the fact that competent agents will be led to consider predator-prey relationships when considering the source. However, as Suárez notes, this is insufficient for scientific representation because in science the terms interact in a non-arbitrary way. To account for this, he argues that there is another feature of the model, which is the capacity to allow for surrogate reasoning. In this case, that means that individuals who examine or manipulate the model in terms of its parts (the multiple variables) will be able to draw certain inferences about the nature of real-world interactions between predators and prey (the parts of the target system). These insights will occur in part due to the nature of the model as well as the norms of scientific practice, which means that the inferences will be non-arbitrarily related to the real-world phenomena and that certain specified inferences can be recognized as being of scientific interest.
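
For concreteness, here is a minimal sketch of the Lotka-Volterra model itself (the parameter values are hypothetical). Stepping the equations forward and reading off, for example, that the two populations oscillate is the kind of surrogate reasoning at issue, with the inference then interpreted as concerning real predator-prey systems:

    # The Lotka-Volterra predator-prey equations, stepped with forward Euler.
    # All parameter values and initial populations are arbitrary illustrations.
    alpha, beta, delta, gamma = 1.1, 0.4, 0.1, 0.4   # growth and interaction rates
    prey, predators = 5.0, 3.0
    dt = 0.001

    for step in range(10000):                         # 10 units of model time
        d_prey = alpha * prey - beta * prey * predators
        d_pred = delta * prey * predators - gamma * predators
        prey += d_prey * dt
        predators += d_pred * dt

    print(round(prey, 2), round(predators, 2))        # populations at the end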

ii. Contessa

Suárez’s inferential account has been further developed by Gabriele Contessa (2007, 2011). He is explicit in his claim that the interpretational view he is defending is not a deflationary account, but is rather a substantive version of the inferential account insofar as he takes the account to give necessary and sufficient conditions of representation. All the same, the account he defends is clearly pragmatic in nature. Contessa begins by noting an important distinction he has drawn from Suárez’s work, that of the difference between three types of representation. The first is mere denotation, in which some (arbitrarily) chosen sign is taken to stand for some object. He gives the example of the logo of the London Underground denoting the actual system of trains and tracks.

The second sort of representation is what Contessa calls “epistemic representation” (2007, 52). An epistemic representation is one which allows surrogate reasoning of the sort described by Suárez. The London Underground logo does not have this feature since no one would be able to use it to figure out how to navigate. A map of the London Underground, on the other hand, would have this feature insofar as it could be used by an agent to draw these sorts of inferences.

The final sort of representation is what he calls “faithful epistemic representation” (2007, 54-55). Whether or not a representation is faithful is a matter of degree, so something will be a completely faithful epistemic representation provided all of the valid inferences which can be drawn about the target using the source as a vehicle will also be sound. Notice this does not require that a model user be able to draw every possible inference about the target, but rather that the inferences licensed by the map that are drawn will be sound inferences (both following from the source and true of the target). In this sense, a map of the London Underground produced yesterday will be more faithful than one produced in the 1930s.

Using this framework, Contessa goes on to describe a scientific model as an epistemic representation of features of particular target systems (56). The scientific model will be representational for a user when she interprets the source in terms of the target. He remains open to there being multiple sorts of interpretation which are relevant, but suggests that the most common sort of interpretation is “analytic,” which functions quite similarly to an isomorphism in which every part and relation of the source is interpreted as denoting one and only one part and relation in the target (and all of the target’s parts and relations are denoted by some part or relation from the source).

Of course, given that this is determined by the agent’s use, it is not necessary that the agent believe that her interpretation is actually the case about the system. Here is where Contessa draws on the distinction of faithfulness. Since models are often misrepresentations and idealizations as has been discussed above, they need not be completely faithful in order to be useful. This is not the end of the story, though, because the circumstances also play an important role in understanding whether or not something is a scientific representation.

c. Agent-Based Versions of Substantive Accounts

In light of some of the insights of Suárez and others, many of the views described above as substantive views were altered and updated to make more explicit and central reference to the role of an agent, making them what could be called agent-centered approaches. Of most importance, given their role in the substantive views described above, are the recent advances made by van Fraassen and Giere.

i. Agent-Based Isomorphism

The view of isomorphism commonly attributed to van Fraassen, which was described above, is the one drawn from his 1980 book, The Scientific Image. In his (2008), van Fraassen presents an altered account of representation which places much more emphasis on the role of an agent. Van Fraassen notes that while some reference to an agent was a part of his earlier views (Ladyman, Bueno, Suárez, and van Fraassen, 2010), Suárez’s important work on deflationary accounts was influential in the development of the view he now defends (2008, 7, 25-26; 2010).

He begins his account by looking primarily to the way in which a representation is used, saying that a source’s being representative of some target “depends largely, and sometimes only” on the way in which the source is being used (2008, 23). Though he does not take himself to be offering any substantive theory of representation, he does call this the Hauptsatz or primary claim of his account of representation: “There is no representation except in the sense that some things are used, made, or taken, to represent things thus and so” (2008, 23).   Van Fraassen notices that this places some restrictions on what can possibly be representational. Mental images are limited, because they are not made or used in some way. That is to say, we do not give our mental states representational roles. Similarly, there is no such thing as a representation produced naturally. What it is to be a representation is to be taken or used as a representation, and this is not something that happens spontaneously without the influence of an agent.

Van Fraassen also notices an important distinction in two ways of representing: representation of and representation as. When scientists take or use some source to be representational, they take it to be a representation of some target. This target can change based on context, and sometimes scientists might not even use the source to be a representation at all. Consider van Fraassen’s example: we can use a graph to represent the growth of bacterial colonies under certain conditions, and so the graph will be a representation of bacterial growth (2008, 27). But we could also use that graph to represent other phenomena, perhaps the acceleration of an object as it is dropped from some height. Part of what this captures is the way in which our perspectives can change the way in which we are representing a particular appearance. Thus, by using a source in some distinct way, we can represent some particular appearance of some particular phenomena.

In intentionally using a source as a representation, scientists do not only make it a representation of something; they also represent it in a certain light, making certain features salient. This is what van Fraassen calls representation as. Two representations can be of the same target but might represent that target as something different. Van Fraassen offers an example: everything that has a heart also has a kidney, but representing some organism as having a heart does not mean the same thing as representing it as having kidneys (2008, 27). Similarly, we might represent the growth of bacteria mentioned above as an example of a certain sort of growth model or as the worsening of some infection as it is seen as part of a disease process.

Of course, all of this is very general, which van Fraassen acknowledges. However, in a true deflationary attitude, he notices that there is no good way of getting more specific about scientific representation since it has “variable polyadicity: for every such specification we add there will be another one” (2008, 29). Nonetheless, he still maintains that the link between a good or useful representation and phenomena requires a similarity in structure. As it stands, then, there is still an appeal to isomorphism present in his account: “A model can (be used to) represent a given phenomenon accurately only if it has a substructure isomorphic to that phenomenon” (2008, 309). Just as before, we have an account of representation which relies on isomorphism between the structure of the theoretical models and the (structure of) the phenomena. All the same, this is still a markedly different view from his earlier view described above. No longer is it the isomorphism or structural relationship alone which is representational. Now, on van Fraassen’s views, it is the fact that a scientific community uses it or takes it to be representational.

ii. Agent-Based Similarity

Ian Hacking (1983) has famously argued that, in philosophical discussions of the role and activity of science, too much emphasis is put on representation. Instead, he suggests that much of what is done in science is intervening, and this concept of intervention is key to understanding the reality with which science is engaged. All the same, he still thinks that science can and does represent. Representation, on his account, is a human activity which exhibits itself in a number of different styles. It is people who make representations, and typically they do so in terms of a likeness, which he takes to be a basic concept. Representation in terms of likeness, he thinks, is essential to being human, and he even speculates that it may have played a role in human development, much as many think language did. In creating a likeness, though, he argues that there is no analyzable relation being made. Instead, "[likeness] creates the terms in a relation…First there is representation, and then there is 'real'" (139). Representation, on his view, is not in the business of being true or false, since the representation precedes the real.

Giere (2004, 2010) has also made pragmatics more central and explicit in his account of scientific representation. He claims that in attempting to understand representation in science we should not begin with some independent two-place relationship that exists substantively in the world. Instead, we should begin with the activity of representing. If we are going to view this activity as a relationship, it will have more than two places. He proposes a four-place relation: "S uses X to represent W for purposes P" (2004, 743). Here, S is some agent, broadly construed: it could be an individual scientist or, less specifically, some group of scientists. X is any representational object, including models, graphs, words, photographs, computational models, and theories. W is some aspect or feature of the world, and P stands for the aims and goals of the representational activity; that is, the reasons why the scientist is using the source to represent the target. Giere identifies a number of different potential purposes of representation. These include things like learning what something is actually like, but they are fairly contextual and depend upon the question being asked. So the way in which something is modeled might change depending on the purposes of the representation (2004, 749-750).

Giere is still working from what should be considered a semantic conception of theories, in which a theory is a set of models which are created according to a set of principles and certain specific conditions. The principles are what we might otherwise think of as empirical laws, but he does not conceive of them as having empirical truth. Instead, he thinks of them as principles by which scientists can form models, and it is the scientists who construct and use the models who make the otherwise general and idealized principles particular. On this view, then, it is the models which are representational and which link up to the empirical world.

There are many ways a scientist can use a model to represent the world, on Giere’s view, but the most important way remains similarity. Giere is quick to note that this does not mean that we need to think of the representational relationship as some objective or substantive relationship in the world. Instead, the scientist who uses the model does the representing and she will often do this in virtue of picking out certain salient features of a model which are similar to the target system. In doing so, the scientist specifies the relevant aspects and degrees of similarity which she is using in her act of representation.

One of the advantages of this updated version of the similarity view is the wide range of models which can be effectively representational on this account (Giere 2010). Giere gives an example of a time when he saw a nuclear physicist treat a pencil as a model of a beam of protons, explaining how the beam could be polarized. It is in virtue of the invoked similarity between the pencil and the beam of protons, that is, the fact that the physicist specifically used a relevant similarity, that he was able to use the pencil to represent the beam. By noting the importance of the role of the agent, Giere is better able to account for the whole range of scientific representations.

A similar yet importantly distinct account of representation as similarity is defended by Paul Teller (2001). Teller argues that we should abandon what he calls the "perfect model model," in which models are taken to correspond perfectly to their real-world targets. Instead, he thinks that models are rarely, if ever, perfect matches for their targets. This does not mean that models are not representations. He argues that models represent their targets in virtue of similarity, though he denies that any general account of similarity can be given. What makes something a relevant similarity depends deeply upon the circumstances at hand, including the interests of the model user.

d. Gricean

One way to ‘deflate’ the problem of scientific representation is to claim that there is no special problem for scientific representation, and instead argue that we should understand the question of scientific representation as part of the already widely discussed literature on representation in general. This is the project taken up by Craig Callender and Jonathan Cohen (2006). According to their view, representation in many different fields (art, science, language, and so forth) can be explained by more fundamental representations, which are common to each of the fields.

To explain this, they appeal to what they call "General Griceanism," which takes its general framework from the insights of Paul Grice. On their General Gricean view, the representational nature of scientific objects will be explained in terms of something more fundamentally representational. The more fundamentally representational objects in this case are mental states. This, in effect, pushes the hard philosophical problem back a stage, since some account must be given of the representational nature of mental states. They remain uncommitted to any particular account of the representational nature of mental states, leaving that as something to be argued about in the philosophy of mind. All the same, they mention a few popular candidates: functional role theories, informational theories, and teleological theories.

There are, on their view, significant advantages to taking this General Gricean viewpoint. For one, it has a certain sort of simplicity to it. By explaining all representation in terms of the fundamental representations of mental states, we do not need to give wildly different explanations as to why a scientific model represents its target and why, for example, a green light represents 'go' to a driver. Each occurs because, in virtue of what the scientist or "hearer" knows, a certain mental state will be activated which carries the relevant representational content.

They can also explain why similarity or isomorphism will commonly be used (though neither is necessary), since these are strong pragmatic tools for bringing about the relevant mental state with its representational content. This is, as they argue, clearly one of the reasons why people from Michigan use an upturned left hand to help them explain the relative location of their hometown: the upturned left hand is similar in shape to the shape of Michigan. The reasons why similarity is a useful tool here are identical to the reasons why similarity would be useful in scientific contexts: it makes the relevant instance of communication more effective, meaning that the hearer (or user of a model or scientific representation) will be better able to arrive at the relevant mental states which represent the target system.

In short, their view is that while there might be a general philosophical problem of representation, there is not anything special about scientific practice that makes its stake in this problem any different from any other field or the general problem. Of course, as they amusingly note, this passes the buck to the more fundamental question: “Once one has paid the admittedly hefty one-time fee of supplying a metaphysics of representation for mental states, further instances of representation become extremely cheap” (71).

e. Critiques of Deflationary and Pragmatic Accounts

These deflationary and pragmatic accounts of representation have not avoided criticisms of their own. Many of these criticisms are presented as part of the defense of one of the views over another. For example, Contessa (2011) argues against a purely denotational account of scientific representation, such as the one seen in Callender and Cohen (2006). As he says, "Whereas denotation seems to be a necessary condition for epistemic representation, it does not, however, seem to be a sufficient condition" (2011, 125). As Contessa argues, it is insufficient merely to be able to stipulate a denotational relationship to have the sort of representation which is useful to scientists. For example, we might use any given equation (for example, F=ma) to denote the relationship which holds between the sizes of predator and prey populations. But, while this equation could successfully denote this relationship, it will not be of much use to scientists because they will not be able to draw many insights about the predator-prey relationship. Therefore, Contessa argues, while denotation is a necessary condition of representation, it cannot alone be the whole story. In addition, he suggests the need for interpretation in terms of the target, as described above.

Matthias Frisch (2015, 296-304) has raised a worry which he addresses specifically to van Fraassen's (2008) account, but which is applicable to many of the pragmatic and deflationary accounts described above. The worry is that if we take van Fraassen's Hauptsatz ("There is no representation except in the sense that some things are used, made, or taken, to represent things thus and so" (2008, 23)) literally, then it seems to be impossible that some models can represent. Taking Frisch's example, say we wanted to construct a quantum mechanical model of a macroscopic body of water. To do this, "we would have to solve the Schrödinger equation for on the order of 10^25 variables—something that is simply impossible to do in practice" (297). But if this is so, then it turns out that the Schrödinger equation cannot be used to represent a macroscopic body of water—since we could never use the equation in this way, it is not representational in this way. Notice that this concern applies to other pragmatic and deflationary accounts: if we are unable to make inferences or interpret the source in terms of the target (which, given the complexity here, it seems we would not be able to do), then it will also fail to be representational on these other accounts. But this leads to the fairly strong conclusion that we can only use a model to represent a system once we have actually applied the model to that system. For example, the Lotka-Volterra model seems to represent only those systems for which scientists have used it; it does not represent all predator-prey relationships in general.
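
For reference, the equation at issue in Frisch's example is the time-dependent many-body Schrödinger equation, given here in its standard textbook form rather than in any formulation specific to Frisch's text:

i\hbar \frac{\partial}{\partial t} \Psi(x_1, \ldots, x_N, t) = \hat{H}\, \Psi(x_1, \ldots, x_N, t)

where the wave function \Psi depends on the coordinates of all N particles. For a macroscopic body of water, N is on the order of 10^25, which is why solving the equation for such a system is impossible in practice.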

Frisch (2015, 301-304) does not think that this argument is ultimately fatal to the pragmatic accounts: because there are constraints on the use of models which are part of scientific practice, there is a sense in which the Lotka-Volterra model, for example, represents all predator-prey relationships (even though it has not yet been used in this way). There is no problem in extending models "horizontally," that is, to other instances which are in the same domain of validity of the model. There is, Frisch argues, a problem in extending models "vertically," that is, using a model to represent some phenomenon which is outside the domain of validity. This can be seen in the quantum mechanics example from above, since we do not have any practice in place of using the Schrödinger equation to describe macroscopic bodies of water. So, he claims, van Fraassen's view (and, by extension, the other pragmatic and deflationary views) must be committed to an anti-foundationalism (the view that the sciences cannot be reduced to one foundational theory) that denies that the models of quantum mechanics can adequately represent macroscopic phenomena. Of course, the anti-foundationalist commitment might be viewed as a desirable feature of these views, rather than a flaw, depending upon other commitments.
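
For concreteness, the Lotka-Volterra model that recurs in this discussion can be written out in its standard textbook form (a generic parameterization, not one tied to any particular author cited here). It consists of two coupled differential equations for a prey population x and a predator population y:

\frac{dx}{dt} = \alpha x - \beta x y, \qquad \frac{dy}{dt} = \delta x y - \gamma y

where \alpha, \beta, \gamma, and \delta are positive parameters for prey growth, predation, predator death, and predator reproduction. In Frisch's terms, applying these equations to a further predator-prey system is a "horizontal" extension within the model's domain of validity, whereas applying them to something outside that domain would be a "vertical" extension.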

Another important critique which applies more generically to a number of these deflationary and pragmatic views comes from Chakravartty (2009). As described above, many of those who argue for a deflationary or pragmatic account of representation offer their view as an alternative to the substantive accounts. That is to say, they deny that scientific representation is adequately described by the substantive accounts and do not merely add to these accounts, but rather reject them and offer their deflationary or pragmatic account instead. Chakravartty argues that this is a mistaken move. We should not think of deflationary or pragmatic accounts as alternatives to the substantive accounts, but rather as complements. On the deflationary or pragmatic accounts, representation occurs when inferences can be made about the target in virtue of the source. But, "how, one might wonder, could such practices be facilitated successfully, were it not for some sort of similarity between the representation and the thing it represents—is it a miracle?" (201). That is to say, the very function which proponents of the deflationary or pragmatic accounts take to be the central explainer of scientific representation seems to require some sort of similarity or isomorphism (Bueno and French 2011). On Chakravartty's view, the pragmatic or deflationary accounts go too far in eliminating the role for some substantive feature. In doing so, they leave an important part of scientific representation behind.

3. Model-Based Representation

The question of scientific representation has received important attention in the context of scientific modeling. There is a vast literature on models, and much of it is at least tangentially related to the questions of representation. An examination of this literature provides an opportunity to see other sorts of insights with regard to representation and the relationship between the world and representational objects.

a. Models as Representations (and More)

Much of the literature on models focuses on the various roles of models within scientific practice, both representational and otherwise. In an influential volume on models, Margaret Morrison and Mary Morgan (1999) use a number of examples of models to defend the view that models are partially independent from theories and data and function as instruments of scientific investigation. We can learn from models due to their representational features. Morrison and Morgan start out by focusing on the construction of models. Models, on their account, are constructed by combining and mixing a range of disparate elements. Some of the elements will be theoretical and some will be empirical, that is, from the data or phenomena. Thus far, this view is mostly in line with what has been discussed in the above sections. What makes models unique in their construction is that they often involve other outside elements. Sometimes these are stories (ways of explaining some unexpected data which are not part of a theory); other times they are a sort of structure which is imposed onto the data. These other elements, they argue, give models a sort of partial independence or autonomy. This is true even when the outside elements are not as obviously present, for example when a model is an idealized, simplified, or approximated version of a theory. This independence is crucial if models are to help us understand both theories and data, as we often use them to do.

According to Morrison and Morgan, models function like tools or instruments for a number of purposes. There are three main classifications of the uses of models. The first is in interacting with theories: models can be used to explore a theory or to make usable a theory which is otherwise unusable. They can also be used to help understand and explore areas for which we do not yet have a theory. Other times, the models are themselves the objects of experimentation. The second classification is in measurement: models serve not only as a way of structuring and presenting measurements but can also function directly as instruments of measurement. Finally, models are useful when designing and creating technology.

Models are not valuable only insofar as they have these functions. Models, Morrison and Morgan argue, are also importantly representational. Their representational value relies in part on the way in which they are constructed from both theory and data or phenomena. Models can represent theories, can represent data, or can be representational instruments which mediate between data and theory. Whatever the case, representation, on their view, is not taken to be some mirroring or direct correspondence between the model and its representational target. Instead, "a representation is seen as a kind of rendering–a partial representation that either abstracts from, or translates into another form, the real nature of the system or theory, or one that is capable of embodying only a portion of a system" (1999, 27). Sometimes models can be used to represent non-existent or otherwise inaccessible theories, as they claim is the case with simulations.

The final role of models described by Morrison and Morgan is the way that models afford the possibility of learning. Sometimes the learning comes in the construction of the model. Most frequently, though, we learn from models by using and manipulating them. In doing so, we can learn because the models have the other features already described: the wide range of sources for construction, the functions, and their status as representations. Oftentimes, the learning takes place internal to the model. In these cases, the model serves as what they call a representative rather than a representation. With representatives, the insights we can gain from manipulating the model are all about the model itself. But in doing so, we come to a place from which we can better understand other systems, including real-world ones. Other times, we take the world into the model and then manipulate the world inside the model, as a sort of experiment.

Daniela Bailer-Jones (2003, 2009) defends a slightly different but related account of the representational nature of models. On her account, models entail certain propositions about the target of the model. As propositions, they are subject to being true or false. One way of thinking about the representation of models is to say that models are representational insofar as their entailed propositions are true. However, this cannot be exactly right, since, as was mentioned above, models oftentimes intentionally entail false propositions. Since models are about those aspects of a phenomenon which are selected, they will fail to say things about other aspects of a phenomenon. In some cases, the propositions entailed may be true for one aspect but false for another. This is where model users come in: they decide what function the model has, the ways in which and the degree to which the model can be inaccurate, and which aspects of the phenomenon the model is actually representing. In sum, on her view, models are representational in part due to their entailed propositions, but also due to the role of the model users.

Tarja Knuuttila (2005, 2011) has argued that in thinking about models too much emphasis has been placed on their representational features – even in accounting for their epistemic value. Following and expanding on Morrison and Morgan (1999), she argues that we should think of models as being material epistemic artefacts, that is, "intentionally constructed things that are materialized in some medium and used in our epistemic endeavors in a multitude of ways" (2005, 1266). The key to their epistemic functioning is to be found in their constrained and experimentable nature. Models, according to this account, are constrained by their construction in such a way that they make certain scientific problems more accessible and amenable to systematic treatment. This is one of the main roles of idealizations, simplifications, and approximations. On the other hand, the representational means used also impose their own constraints on modeling. The representational modes and media through which models are constructed (for example, diagrams, pictures, scale models, symbols, language) all afford and limit scientific reasoning in their different ways. When considered in this respect, Knuuttila argues, we can see that models have far more than mere representational capacities: they are themselves the targets of experimentation and can be thought of as creating a sort of conceptual parallel reality.

b. Model-Building

In addressing model-building, Weisberg (2007b) and Godfrey-Smith (2006) both take up the idea that the characteristic way in which models are constructed is indirect. This comes about in a three-step process in which a scientist first constructs a model, then analyzes and refines the model, and finally examines the relationship between the model and the world (Weisberg 2007b, 209). Scientists use and understand models by means of "construals" of the model (Godfrey-Smith 2006). The construal, on Weisberg's account, is made of four parts. The first is an assignment, which maps various parts of the model onto the phenomena being investigated. The second part of a construal is the scope, which tells us which aspects of the phenomena are being modeled. The final two parts of the construal are fidelity criteria. One of these is the dynamical fidelity criterion, which identifies a sort of error tolerance for the predictions of the model. The other is the representational fidelity criterion, which gives standards for understanding whether the model gives the right predictions for the right reasons, that is, whether or not the model is linking up to the causal structure which explains the aspects of the phenomenon being modeled.

This strategy of model-based science is contrasted with a different sort of strategy, what Weisberg calls abstract direct representation. Abstract direct representation is the strategy of science in which study of the world is unmediated by models. He gives the example of Mendeleev's development of the periodic table of elements. This process did not begin with a hypothetical abstract model which is refined and then used representationally (which is how Weisberg thinks model-based science proceeds). Instead, it starts with the phenomena and abstracts away to more general features. This distinction between modeling and abstract direct representation underlines the possibility that not all scientific representations need to be achieved in the same way.

There are some worries about Weisberg’s understanding of the process of model-making (Knuuttila and Loettgers, in press). Through a close examination of the development of the Lotka-Volterra model, Knuuttila and Loettgers argue that the process of model-building often begins with certain sorts of templates, or characteristic ways of modeling some phenomena, typically adopted from other fields. Such already familiar modeling methods and forms offer the modeler a sort of scaffolding upon which they can imagine and describe the target system. They also argue that another distinct feature of model-making is its outcome-orientation. That is, in developing a model, a scientist will typically do so with an eye to the anticipated insights or features of the target system that they wish to represent. Thus, on their view, the modeler pays close attention to the target system or empirical questions in all stages of the development of the model (not just at the end, as Weisberg suggests).

c. Idealization

One of the important discussions that has developed primarily in the literature on models concerns idealization. Weisberg (2007a) argues that there are three different kinds of idealization, which he generically describes as "the intentional introduction of distortion into scientific theories" (639). The first kind of idealization he calls Galilean idealization. This is the sort of idealization in which a theory or model is intentionally distorted to make it simpler, in order to render it computationally tractable. This sort of idealization occurs when scientists ignore certain features of a system or theory, not because those features play no role in what actually happens, but rather because including them makes the application of the theory or model so complex that scientists cannot gain traction on the problem. By removing these complexities, scientists distort their model (because it now lacks complexities which reflect the target). But after gaining some initial computational tractability, they can slowly reintroduce the complexities and thus remove the distortions.

The second type of idealization is what Weisberg calls minimalist idealization. In a minimalist idealization, the only features that are carried into the model or theory are those causal features which make a difference to the outcomes. So, if some feature of a target can be left behind without losing predictive power, a minimalist idealization will leave that feature behind. As an example, Weisberg notes that when explaining Boyle's law, it is often assumed that there are no collisions between gas molecules. This is, in fact, false, since collisions between gas molecules are known to take place in low-pressure gases. But, "low pressure gases behave as if there were no collisions" (2007a, 643). So, since these collisions do not make any difference to our understanding of this system, scientists can (and do) leave this fact behind.
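
For reference, Boyle's law itself, in its standard textbook statement (not Weisberg's own formulation), says that for a fixed amount of gas held at constant temperature, pressure and volume are inversely proportional:

P V = k, \quad \text{or equivalently} \quad P_1 V_1 = P_2 V_2

where k is a constant for the given sample and temperature. The minimalist idealization Weisberg describes explains this behavior while assuming away collisions between molecules, since including them would make no difference to the predicted pressure-volume relationship in low-pressure gases.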

Notice that this is distinct from Galilean idealization insofar as minimalist idealizations leave certain features out of their theories or models because they make no difference to the relevant tasks or goals at hand. Galilean idealization, on the other hand, leaves certain features out even when they do make a difference, simply because leaving them in would make the model more complex and less tractable.

The final sort of idealization described by Weisberg is what he calls multiple-models idealization. This is the practice of using a number of different, often incompatible models to represent or understand some phenomenon. In this case, none of the models by itself is capable of accurately modeling the relevant target system. All the same, each of the models is good at representing certain features of the target system. Thus, by using not just a single model but rather this group of models, each of which is distorted, scientists can get a better sense of the target system. Weisberg offers a helpful example: the National Weather Service uses a number of different models in making its weather forecasts. Each of the models used represents the target in a different way, each being inaccurate in some way or another. It is the use of all of these models that permits forecasts of higher accuracy since attempts to make a single model have resulted in less accurate predictions.

4. Sociology of Science

a. Representation and Scientific Practice

Many important insights on the nature of scientific representation have come not from the philosophy of science but rather from thinkers who would typically be considered part of the field of sociology of science. The work from this field can serve both as a source of insight on the nature of representation in scientific practice and as a challenge to the primarily epistemically-oriented insights from the philosophy of science. Michael Lynch and Steve Woolgar (1990) edited an important collection of papers on scientific representation in practice written from the perspective of sociology of science, called Representation in Scientific Practice. More recently (2014), Lynch and Woolgar edited another collection with Catelijne Coopmans and Janet Vertesi. Treating representation from the perspective of sociology of science involves asking a different sort of question than the one so far addressed in this article. Instead of asking about the constitution of scientific representation, sociologists of science are more interested in a different question: "What do the participants, in this case, treat as representation?" (Lynch and Woolgar 1990, 11). In the introduction to that volume, Lynch and Woolgar provide a general overview of some of the important insights from this perspective.

Since sociology of science treats scientific practice as its object of inquiry, it is keen to describe precisely how representations are actually used by scientists. They note the importance of “the heterogeneity of representational order” (2). That is, there is a wide range of devices which are representational as well as a wide range of ways in which the representations are used and in which they are useful. Importantly, sociologists are often interested in discussing more than merely the epistemic or informational role and use of representations, viewing them as significantly social, contextualized, and otherwise embedded in a complex set of activities and practices. Sociologists of science attempt to pay attention to the whole gamut of representations and representational uses to better understand precisely the role they play within scientific investigation.

Another important insight, Lynch and Woolgar note, is that the relation between representations is not to be thought of as directional in the sense that the representations move from or towards some “originary reality” (8). Instead, any directionality of representations is to be thought of as “movement of an assembly line” (8). That is to say, representational practice must be seen as constructing not only a representation, but also (re)constructing a phenomenon in a way so that it can be represented. This is something that can be seen in much of the literature from sociologists of science, including the work of Latour (see below).

In paying close attention to the way representations are actually used, some sociologists of science note settings in which there are "discrepancies between representations of practice and the practices using (and composing) such representations" (9). These discrepancies and other problems encountered in the actual practice of science allow for improvisation and creativity which can help advance the particular domains of which they are a part. Sociologists of science are interested in studying this creativity, not only for its productivity in science, but also as an interesting phenomenon in its own right.

b. Circulating Reference

A particularly telling example of these insights, especially from the philosophical point of view, is provided in Bruno Latour's "Circulating Reference" (Latour 1999). Latour's photo-philosophical case study is based on the work of a group of scientists who were examining the relationship between a savannah and a forest ecosystem. At the end of their project, they collectively published a paper on their findings which included a figure of the interaction between the ecosystems, detailing the change in soil composition among other features. Latour asks how it is that this abstract drawing, which takes a perspective no individual could possibly have had and which ignores so many of the features of the ecosystem, can be about that stretch of land. That is to say, here we have a drawing, something made of ink and paper, and there we have the forest-savannah ecosystem; how is it that the former can be about the latter?

Latour's method of answering, which takes the form of a strikingly well-written case study in which he uses pictures from the expedition to structure and represent the process he describes, is to look carefully at all the details and steps by which the scientists got from the expedition to the figure in the paper. What happens, says Latour, is that there is a series of steps through which the scientists abstract from the world in some intentional fashion. In doing so, they maintain some relevant feature of the world, but they are simultaneously constructing the phenomena they are studying. In the process, the representations produced also become more abstract.

An example will make this clearer. At one stage in the process, soil samples are collected from a vertical stretch of ground. These samples are transferred into a device which allows the whole vertical stretch of earth to be viewed synoptically. In taking the sample, the scientist has already begun to construct: already this particular bit of dirt is taken to be representative of the dirt over a much wider area of land. Once the soil has been collected, various features of the soil are maintained through intentional actions on the part of the scientist. For example, the scientists will label the soil as being of a certain sort of consistency. The scientists then use a clever device, a tool with pinholes set beside the various Munsell colors and numbers, which is itself a construction with a long history. In looking through the pinholes, the scientist can abstract away from the dirt sample itself, taking, in some sense, only the color (which is done in virtue of a construction of numbers associated with particular colors). Something has clearly been lost, namely, the full materiality of the dirt. But something has also been gained, in this case, a number which corresponds to the color of the dirt: some usable, manipulable data.

Latour’s essay carefully describes many of these transitions from the savannah-forest system to the published figure. As he claims, it is this series of transitions (which each involve abstraction and construction due to the intentional decisions of a scientist) which ensures that the figure at the end references or represents the savannah-forest system. There is not a single gap between the figure and the world which must be accounted for by some representational relation. Instead, on his account, there is a large series of gaps, each of which is crossed by a scientist’s actions in abstracting and maintaining, constructing and discovering. This series, he thinks, can be extended infinitely in either direction. By abstracting further from the already quite-abstract figure, certain hypotheses might be suggested, which would result in a return to the savannah-forest system, to gather data which might be more basic than the data already gathered. On his view, there is no such thing as “the world” which is the most basic thing-in-itself. Nor is there any most-abstracted element.

c. Critiques of Sociology of Science

While these insights from the sociology of science literature have been both sources of support and criticism for the philosophical literature, they have also been subject to criticisms. One important criticism comes from Giere’s (1994) review of Lynch and Woolgar’s (1990) Representation in Scientific Practice. Giere’s primary target is the extremely constructivist nature of the sociology of science literature. The constructivist approach claims that science is socially constructed: that is, science is filled with socially-dependent knowledge and aimed at understanding socially-constructed objects. There is no such thing, on this view, as a non-constructed world to be understood by scientists, and therefore, no such world to be represented. The attempt to explain representation in this framework results in a “no representation theory of representation” (Giere 1994, 115). But, Giere thinks there is a straightforward counter-slogan to a view of this sort: “no representation without representation” (115). That is to say that if there is nothing ‘out there’ in the world being represented, it cannot be that this is an instance of representation. This is not to reject the importance of paying attention to the role of the practices and representational devices in particular case studies. All the same, Giere argues that if we want a general account of scientific representation, “we must also go beyond the historical cases” (119). Put otherwise, the sociology of science perspective is an important part of explaining scientific representation, but this work by itself leaves representation unexplained.

Knuuttila (2014) takes up a similar line of criticism. While she places great importance on the insights of sociologists of science, she thinks that many of their views have developed with a false target in mind. Many sociologists of science place their views as a contrast to a traditional philosophical view of science as something which perfectly represents the world. The alternative, they suggest, is their constructivist approach, as described above. However, Knuuttila argues, this motivation runs into problems. First, when they select certain practices to investigate rather than others, by what criterion are they distinguishing these practices as representational? In doing so, they seem to be relying on some traditional account of representation to delineate the cases of interest. Further, it seems that these studies do not show that representation is a defunct concept and that we are bound to a purely constructivist account of science. Instead, "these cases actually reveal…what a complicated phenomenon scientific representation is…and give us clues as to how, through the laborious art of representing, scientists are seeking and gaining new knowledge" (Knuuttila 2014, 304). We need not think that, just because there is no perfect representation of the world, there is therefore no world to be represented. Their insights could equally contribute to an intermediate view in which we reject this perfect-representation view of science, but still maintain that science is giving us knowledge of the real world. That is, we can simultaneously deny that representations "are some kind of transparent imprints of reality with a single determinable relationship to their targets" while still affirming that the "artificial features of scientific representations…result from well-motivated epistemic strategies that in fact enable scientists to know more about their objects" (Knuuttila 2014, 304).

5. References and Further Reading

  • Bailer-Jones, D. (2003). When Scientific Models Represent. International Studies in the Philosophy of Science 17: 59-74.
    • Representation in models is linked to entailed propositions and pragmatics.
  • Bailer-Jones, D. (2009). Scientific Models in Philosophy of Science. Pittsburgh: University of Pittsburgh Press.
    • Extended discussion of the history of philosophy of models and defends an account of models.
  • Bartels, A. (2006). Defending the Structural Concept of Representation. Theoria 55: 7-19.
    • Homomorphism as an account of representation.
  • Brading, K. and E. Landry. (2006). Scientific Structuralism: Presentation and Representation. Philosophy of Science 73: 571-581.
    • Structuralism as a strong methodological approach in science.
  • Bueno, O. (1997). Empirical Adequacy: A Partial Structures Approach. Studies in History and Philosophy of Science 28: 585-610.
    • Representation as partial isomorphism.
  • Bueno, O. and S. French. (2011). How Theories Represent. British Journal for the Philosophy of Science 62: 857-894.
    • Representation as partial isomorphism; replies to arguments against partial isomorphism.
  • Callender, C. and J. Cohen. (2006). There Is No Special Problem About Scientific Representation. Theoria 55: 67-85.
    • Scientific representation is to be explained like other problems of representation.
  • Cartwright, N. (1983). How the Laws of Physics Lie. New York: Oxford University Press.
    • Distortion and idealization in the laws of physics.
  • Cartwright, N., T. Shomar, and M. Suárez. (1995). The Tool Box of Science: Tools for the Building of Models with a Superconductivity Example. Poznan Studies in the Philosophy of the Sciences and the Humanities 44: 137-149.
    • Models are not only theory-driven, but also phenomena-driven.
  • Chakravartty, A. (2009). Informational versus Functional Theories of Scientific Representation. Synthese 172: 197-213.
    • Argues that substantive and pragmatic accounts are complementary.
  • Contessa, G. (2007). Scientific Representation, Interpretation, and Surrogative Reasoning. Philosophy of Science 74: 48-68.
    • A substantive inferential account of representation.
  • Contessa, G. (2011). Scientific Models and Representation. In S. French and J. Saatsi (eds.) The Bloomsbury Companion to the Philosophy of Science, pp. 120-137. New York: Bloomsbury Academic.
    • A substantive inferential account of representation.
  • Coopmans, C., J. Vertesi, M. Lynch, and S. Woolgar (eds.). (2014). Representation in Scientific Practice Revisited. Cambridge: MIT Press.
    • Reexamines the question of representation in scientific practice in contemporary sociology of science.
  • da Costa, N. and S. French. (2003). Science and Partial Truth: A Unitary Approach to Models and Scientific Reasoning. Oxford: Oxford University Press.
    • Representation as partial isomorphism.
  • French, S. (2003). A Model-Theoretic Account of Representation (Or, I Don’t Know Much about Art… but I Know It Involves Isomorphism). Philosophy of Science 70: 1472-1483.
    • Representation as partial isomorphism.
  • French, S. and J. Ladyman. (1999). Reinflating the Semantic Approach. International Studies in the Philosophy of Science 13: 103-119.
    • Representation as partial isomorphism.
  • Frigg, R. (2006). Scientific Representation and the Semantic View of Theories. Theoria 55: 49-65.
    • Criticisms of the semantic approaches to representation.
  • Frisch, M. (2015). Users, Structures, and Representation. British Journal for the Philosophy of Science 66: 285-306.
    • Discusses van Fraassen’s (2008); presents and responds to a criticism.
  • Giere, R. (1988). Explaining Science: A Cognitive Approach. Chicago: University of Chicago Press.
    • A semantic account of theories and representation as similarity.
  • Giere, R. (1994). No Representation without Representation. Biology and Philosophy 9: 113-120.
    • Argues that an appeal only to practice will leave scientific representation unexplained.
  • Giere, R. (2004). How Models Are Used to Represent Reality. Philosophy of Science 71: 742-752.
    • Representation as similarity with input of agents.
  • Giere, R. (2010). An Agent-Based Conception of Models and Scientific Representation. Synthese 172: 269-281.
    • Representation as similarity with input of agents.
  • Godfrey-Smith, P. (2006). The Strategy of Model-Based Science. Biology and Philosophy 21: 725-740.
    • Discusses the use of models as being classified by a particular strategy in science.
  • Goodman, N. (1976). Languages of Art. Indianapolis: Hackett.
    • Argues for an account of representation in art which has been influential to accounts of scientific representation.
  • Hacking, I. (1983). Representing and Intervening. New York: Cambridge University Press.
    • Representation in terms of likeness, argues for the importance of intervention.
  • Hesse, M. (1966). Models and Analogies in Science. Notre Dame, Indiana: University of Notre Dame Press.
    • Argues that models are central to scientific practice.
  • Hughes, R.I.G. (1997). Models and Representation. Philosophy of Science 64: S325-S336.
    • The DDI account of representation.
  • Knuuttila, T. (2005). Models, Representation, and Mediation. Philosophy of Science 72: 1260-1271.
    • Models as epistemic tools.
  • Knuuttila, T. (2011). Modelling and Representing: An Artefactual Approach to Model-Based Representation. Studies in the History and Philosophy of Science 42: 262-271.
    • Relates representation to representationalism, and expands the notion of models as epistemic tools.
  • Knuuttila, T. (2014). Reflexivity, Representation, and the Possibility of Constructivist Realism. In M. C. Galavotti, S. Hartmann, M. Weber, W. Gonzalez, D. Dieks, and T. Uebel (eds.) New Directions in the Philosophy of Science, pp. 297-312. Dordrecht, Netherlands: Springer.
    • Criticizes sociology of science accounts and argues that they are compatible with philosophical accounts.
  • Knuuttila, T. and A. Loettgers. (in press). Modelling as Indirect Representation? The Lotka-Volterra Model Revisited. British Journal for the Philosophy of Science.
    • Focuses on the interdisciplinary, historical and empirical aspects of model construction.
  • Latour, B. (1999). Circulating Reference. In Pandora’s Hope. Cambridge: Harvard University Press.
    • Traces and discusses the steps from some phenomena to a representation.
  • Ladyman, J., O. Bueno, M. Suárez, and B. C. van Fraassen. (2011). Scientific Representation: A Long Journey from Pragmatics to Pragmatics. Metascience 20: 417-442.
    • Discussion of van Fraassen’s (2008), with replies by van Fraassen.
  • Lynch, M. and S. Woolgar (eds.). (1990). Representation in Scientific Practice. Cambridge: MIT Press.
    • Collection of essays on scientific representation by sociologists of science.
  • Morgan, M. and M. Morrison (eds.). (1999). Models as Mediators: Perspectives on Natural and Social Science. New York: Cambridge University Press.
    • Collection of essays on the uses and nature of models.
  • Suárez, M. (2003). Scientific Representation: Against Similarity and Isomorphism. International Studies in the Philosophy of Science 17: 225-244.
    • Criticisms of similarity and isomorphism.
  • Suárez, M. (2004). An Inferential Conception of Scientific Representation. Philosophy of Science 71: 767-779.
    • Inferential account of representation.
  • Suárez, M. (2010). Scientific Representation. Philosophy Compass 5: 91-101.
    • Gives a brief overview of accounts of representation.
  • Suárez, M. (2015). Deflationary Representation, Inference, and Practice. Studies in History and Philosophy of Science 49: 36-47.
    • Discusses deflationary accounts and argues that both his inferential account and the DDI account are deflationary.
  • Suppe, F. (1974). The Structure of Scientific Theories. Urbana: University of Illinois Press.
    • Criticizes the syntactic view and introduces a semantic conception of theories.
  • Teller, P. (2001). Twilight of the Perfect Model Model. Erkenntnis 55: 393-415.
    • Argues for a deflationary account of similarity as representation.
  • Toon, A. (2012). Similarity and Scientific Representation. International Studies in the Philosophy of Science 26: 241-257.
    • Explores responses to criticisms on behalf of similarity.
  • van Fraassen, B. C. (1980). The Scientific Image. New York: Oxford University Press.
    • Representation as isomorphism (among other things).
  • van Fraassen, B. C. (2008). Scientific Representation: Paradoxes of Perspective. New York: Oxford University Press.
    • Representation as isomorphism with important role for agents (among other things).
  • Weisberg, M. (2007a). Three Kinds of Idealization. Journal of Philosophy 104: 639-659.
    • Three different sorts of idealization.
  • Weisberg, M. (2007b). Who Is a Modeler? British Journal for the Philosophy of Science 58: 207-233.
    • Models are indirect representations, a strategy which is distinct from abstract direct representation.
  • Weisberg, M. (2013). Simulation and Similarity: Using Models to Understand the World. New York: Oxford University Press.
    • Representation as similarity.
  • Winther, R. (2015). The Structure of Scientific Theories. In E.N. Zalta (ed.), Stanford Encyclopedia of Philosophy. (Spring 2015 Edition). http://plato.stanford.edu/
    • Detailed discussion of accounts of the structure and representation of scientific theories with extensive bibliography.

 

Author Information

Brandon Boesch
Email: boeschb@email.sc.edu
University of South Carolina
U. S. A.

Olympe de Gouges (1748—1793)

“Woman has the right to mount the scaffold; she must equally have the right to mount the rostrum,” wrote Olympe de Gouges in 1791 in the best known of her writings, The Rights of Woman (often referenced as The Declaration of the Rights of Woman and the Female Citizen), two years before she would be the third woman beheaded during France’s Reign of Terror. The only woman executed for her political writings during the French Revolution, she refused to toe the revolutionary party line in France that was calling for Louis XVI’s death (as is particularly evident in her pamphlet Les Trois Urnes, ou le Salut de la Patrie [The Three Ballot Boxes, or the Welfare of the Nation] of 1793). Simone de Beauvoir recognizes her, in Le Deuxième sexe [The Second Sex] (1949 [1953]), as one of the few women in history who “protested against their harsh destiny.” Favorably described by commentators alternately as a stateswoman, a femme philosophe, an artist, a political analyst, and an activist, she can be considered all of these, but not without some qualification. While contradictions abound in her writings, she never wavered in her belief in the right to free speech and in its role in social and political critique.

On the death of her husband after a brief unhappy marriage, she moved to Paris, where, with the support of a wealthy admirer, she began a life’s work focused wholly on mounting the rostrum denied to women. Defying social convention in every direction and molding a life evocative of feminism of a much later age, Gouges spent her adult life advocating for victims of unjust systems, helping to create a public conversation on women’s rights and the economically disadvantaged, and attempting to bring taboo social issues to the theatrical stage and to the larger social discourse. There is disagreement on whether she participated in or even occasionally hosted some of the literary and philosophic salons of the day. Her biographer Olivier Blanc suggests yes; historian John R. Cole (2011) turns up no evidence. Still, she was recognized among the fashionable and intellectual elite in Paris, and she was well-versed in the main themes of the most influential thinkers of her day, at least for a time. Her name appears in the Almanach des addresses (a kind of social registry) from 1774 to 1784. Despite harsh criticism for using her voice in the political arena and thus challenging deeply entrenched gender norms (“having forgotten the virtues that belong to her sex,” wrote Pierre Gaspard Chaumette, then-President of the Paris Commune, warning other politically active women of Gouges’s fate), she was certainly the author of 40 plays (12 survive), two novels, and close to 70 political pamphlets.

While not a philosopher in any strict sense, she deserves attention for her morally astute analysis of women’s condition in society, for her re-imagining of the intersection of gender and political engagement, for her conception of civic virtue and her pacifist stance, and for her advocacy of selfhood for women, blacks, and children (especially in their right to know their origins). She was among the first to demand the emancipation of slaves, the rights of women (including divorce and unwed motherhood), and the protection of orphans, the poor, the unemployed, the aged and the illegitimate. She had a talent for emulating those she admired, including especially Rousseau but also Condorcet, Voltaire, and the playwright Beaumarchais.

Table of Contents

  1. Early Life
  2. Intellectual Pursuits
    1. Literary
    2. Political
    3. The Rights of Woman (1791)
    4. Philosophical
      1. Gouges and Rousseau
  3. Relevance and Legacy
  4. References and Further Reading
    1. Extant Works by Olympe de Gouges (in French)
    2. On-line English Translations of Gouges’s Original Works
    3. Secondary Sources in English (except Blanc)

1. Early Life

Details are limited. Born Marie Gouze in Montauban, France in 1748 to petite-bourgeois parents Anne Olympe Moisset Gouze, a maidservant, and her second husband, Pierre Gouze, a butcher, Marie grew up speaking Occitan (the dialect of the region). She was possibly the illegitimate daughter of Jean-Jacques Le Franc de Caix (the Marquis de Pompignan), himself a man of letters and a playwright (whose claims to fame include an accusation of plagiarism by Voltaire). In her semi-autobiographical Mémoire de Madame de Valmont sur l’ingratitude et la cruauté de la famille de Flaucourt (Memoir of Mme de Valmont) (1788), Gouges publishes letters, purported to be transcriptions from Pompignan, in which he takes pains to distance himself from Valmont/Gouges. These letters stop short of unequivocal denial of his paternity.

She was married at 16 or 17 in 1765, unhappily, to Louis Aubrey (an associate of Pierre’s), with whom she had a son, also named Pierre, and by the age of 18 or 19 she was widowed. Denouncing marriage because of her recent experience and disguising her widowhood (which would have given her a modicum of social and legal status), she adopted her mother’s middle name and the more aristocratic-sounding “de Gouges” and moved to Paris. Literate (schooled likely by Ursuline nuns in Montauban) but not particularly well-read, she spent the next decade informing herself on intellectual and political matters and integrating into Parisian society, supported by Jacques Biétrix de Roziéres, a wealthy weapons merchant, whom she may have met in Montauban shortly after the death of her husband or through her married sister, Jeanne Raynart. Biétrix ensured her financial security until the decline in his family’s resources in 1788. Her first published work appeared in 1778, marking the end of her first decade in Paris.

She began to write in earnest around 1784. A literary compatriot and admirer of hers was Louis-Sébastien Mercier (1740-1814), with whom she shared many political views, including clemency for the King and a general abhorrence of violence. Mercier helped her navigate the tricky internal politics of the Comédie Française—the prestigious national theater of France—assisting her to publish several of her plays and to stage a handful. Charlotte Jeanne Béraud de la Haye de Riou, Marchioness of Montesson (1739-1806), wife of the Duke of Orleans, herself a playwright and a woman of much influence and wealth, was among the other friends who came to her aid.

Because she had little formal education and was boldly unconventional as a woman, once she began her life of letters her detractors were eager to find fault. She was often accused of being illiterate, yet her familiarity with Molière, Paine, Diderot, Rousseau, Voltaire, and many others, the breadth of her interests, and the speed with which she replied to published criticism all attest to the unlikelihood of the accusation. As French was not her native tongue and since her circumstances permitted, she maintained secretaries for most of her literary career.

2. Intellectual Pursuits

a. Literary

All of her plays and novels carry the theme of her life’s work: indignation at injustice. Her literary pursuits began with playwriting. Gouges wrote as many as 40 plays, as inventoried at her arrest. Twelve of those plays survived, and four found the requisite influential, wealthy, mostly male backing needed for their staging. Ten were published. While many of the plays by the dozen women playwrights whose work had been staged at the Comédie Française were published anonymously or under male pseudonyms, those who were successful on stage under their own names (most notably Julie Candeille) stuck to themes seen as suitable to their gender. Gouges broke with this tradition—publishing under her own name and pushing the boundaries of what was deemed appropriate subject matter for women playwrights—and withstood the consequences. Reviews of her early productions were mixed—some fairly favorable, others patronizing and condescending or skeptical of her authorship. Those of her plays read by the Comédie Française were often ridiculed by the actors themselves. Her later plays, more strongly political and controversial, were met with outright sarcasm and hostility by some reviewers: “[t]o write a good play, one needs a beard,” wrote one critic.

Her first play, L’Homme Généreux [The Generous Man], written in 1785, never produced for the stage but published the following year, explored the political powerlessness of women through the representation of a socially privileged man’s struggle with sexual desire. The play also shines a light on the injustice of imprisonment for debt. Le Mariage Inattendu de Chérubin [Cherubin’s Unexpected Marriage] (1786), one of the several homages of the time written after Beaumarchais’ critically acclaimed Le Mariage de Figaro [The Marriage of Figaro] (1784), is a sequel to Figaro. Intent to rape is a theme in this play as it was in L’Homme Généreux; a privileged husband’s misplaced lust brings damage to the family, while the suffering of the victim is given significant attention. Gouges’s first staged production was originally titled Zamore et Mirza; ou L’Heureux Naufrage [Zamore and Mirza; or, The Happy Shipwreck] (1788). Written in 1784 and later revised, it was finally performed in 1789 under the title L’Esclavage des Nègres, ou l’Heureux naufrage [Black Slavery; or the Happy Shipwreck]. Accepted by the Comédie Française when submitted anonymously in 1785, it was then shelved for four years once the identity (and gender) of the playwright was confirmed. Winning praise from abolitionist groups, it was the first French play to focus on the inhumanity of slavery; it is, not surprisingly, also the first to feature the first-person perspective of the slave. It saw three performances before it was shut down by sabotaging actors and protests organized by enraged French colonists who, deeply reliant on the slave trade, hired hecklers to wreak havoc on the production. Gouges fought back through the press, through her social and literary connections, and through the National Assembly. Understanding that her gender was connected to her lack of success, she called for a second national theatre dedicated solely to women’s productions, and she called for reforms within the Comédie Française itself.

Le Philosophe corrigé, ou le cocu supposé [The Philosopher Chastised, or the Supposed Cuckold] (1787) represents women, despite expectations, as capable of agency around their own sexual desire, and of uniting and supporting each other, giving voice to the phenomenon that Simone de Beauvoir would address much later in The Second Sex with the observation that "women do not say 'we'." The play also depicts a male, the titular husband, as capable of acquiring moral knowledge through an evaluation of emotional response. Sympathy for his inexperienced wife and, later, for an innocent baby gives him insights he uses for moral reflection, a theme found in David Hume (1711-1776), Josiah Royce (1855-1916), and much modern feminist ethics. When French theatre in the decade of the Revolution turned to lighter vaudevillian fare, Gouges tried her hand at light comedy; La Bienfaisance récompensée ou la vertu couronnée [Beneficence Rewarded, or Virtue Crowned] (1788), a one-act comedy, portrayed the then-current Duc d'Orléans, an anti-royalist, as a doer of good.

But most of Gouges's playwriting was dedicated to crafting principally dramatic—if melodramatic by today's standards—pieces responding to the issues of the day. And she had a unique voice on many matters. Molière chez Ninon ou le siècle des grands hommes [Molière at Ninon's, or the Century of Great Men] (1788), for instance, challenges the double standard between the sexes by depicting the famous, fiercely independent, literary courtesan Ninon de Lenclos (1620-1705) as a noble person positively influencing the male intellectuals in her circle, including Molière, all of whom are present to honor a visit by another notable intellectual, Queen Christina of Sweden (1626-1689).

Playwriting for Gouges was a political activity. In addition to slavery, she highlighted divorce, the right of priests and nuns to marry, girls forcibly sent to convents, the scandal of imprisonment for debt, and the sexual double standard as social issues, some repeatedly. Such activism was not unheard-of on the stage, but Gouges carried it to new heights. By 1790, her writings had become more explicitly political. She had three plays published: Les Démocrates et les Aristocrates; ou le Curieux du Champ de Mars [The Democrats and the Aristocrats], a satire of political extremists on both sides; La Nécessité du Divorce [The Necessity of Divorce], again illustrating the powerlessness of women trapped in marriage, and written simultaneously with a debate on the topic in the National Assembly (France would be the first Western country to legalize divorce two years later); and Le Couvent ou les Vœux Forcés [The Convent, or Vows Compelled]. The Convent was her second play to reach the stage, and her greatest success. In the year of its publication it saw approximately 80 performances. Highlighting the political impotence of women, it illuminated the injustice of the Church's complicity in male relatives' right to force females into convents against their will.

Gaining momentum on the political front, her next play to be produced for the stage was Mirabeau aux Champs-Elysées [Mirabeau in the Elysian Fields] (1791). Depicting the Enlightenment philosophers Baron de Montesquieu (1689-1755), Voltaire (1694-1778), and Rousseau, along with Benjamin Franklin (1706-1790), Madame Deshoulières (1638-1694), the Marquise de Sévigné (1626-1696), and Ninon de Lenclos—the latter three major female influences in France during the Enlightenment—the play assembles numerous notable historical figures to welcome Mirabeau (1749-1791) as a hero to the afterlife, in effect honoring his stance as a supporter of constitutional monarchy. Most notable, perhaps, is the appearance of the three women as worthy of a place of honor and a voice, platforms they use to assert, among other things, that the success of the Revolution pivots on the inclusion of women. (This is also the year Gouges wrote The Rights of Woman, discussed separately below.)

In 1792, one year before her trial and execution, she worked on two plays: the unfinished La France Sauvée, ou le Tyran détrôné [France Preserved, or the Tyrant Dethroned] and the completed L’Entrée de Dumourier [sic] à Bruxelles [Dumouriez’s Entry into Brussels].  The former, confiscated at her arrest, was used as proof of sedition at her trial because of its sympathetic depiction of Marie-Antoinette, even as Gouges used it to demonstrate support for her own case.  The latter depicted General Dumouriez’s defense of the Revolution against foreign anti-royalists, assisted by male and female warriors, and challenged her own privileging of aristocracy by suggesting that commoners were the true nobles. It was also the fourth and final play to be staged during Gouges’s lifetime.

The publication of Memoir of Madame de Valmont (1788) ironically begins, rather than summarizes, her political career. This fictionalized self-examination grappled with idealized father figures and fragmented selves, and served to package and compartmentalize her pre-Parisian life and move her forward wholly into a literary existence. Using a version of her (Gouges’s) personal story to make a political point, the narrator of the story sees clearly how gender works to constrain women. Mme de Valmont’s father’s refusal to acknowledge paternity raises issues of legitimacy for Valmont with financial and social repercussions. While he also expressed no patience for women’s forays into traditional male arenas such as writing, the narrator’s solution is to call for women to stop undermining other women and work to support each other. Rejection of the symbolic paternal voice of the culture has political power, and the Memoir presents an 18th century illustration of making the personal political—a vivid theme in 20th century feminism.

Gouges’s second novel Le Prince Philosophe (The Philosopher Prince), also written in 1788 but not published until 1792, reconceives monarchical rule, positing the best kind of ruler as one who would prefer not to rule, and proposing that all rule must be founded on the obligation not to take life. Scholars are mixed on whether she maintained her monarchist stance throughout her life.  Her literary output and her pamphleteering often suggest some version of a monarchy as her default position.  However, Azoulay (2009) suggests, at least in Gouges’s later writings (and perhaps in this second novel as well), her supposed attention to the monarchy is rather an attention to the preservation of the state and to the injustice of taking a life—namely, the King’s. Gouges’s depiction of the cause of the friction between the sexes in this novel is commentary on gender relations that here appears more conservative than will later be the case. The male characters still hesitate to share the reins with women. Yet, women are encouraged to develop reason rather than charm, and one female character submits a carefully drawn up plan advocating education and job opportunities for females, eventually winning a small victory by receiving permission to run a women’s academy.

Gabrielle Verdier (1994) notes several distinguishing features of Gouges's plays: (1) young women have active roles to play, (2) women of any age have agency, (3) female rivalry is absent, (4) mature women are protectors, benefactors, and mentors, and (5) the abuses that women experience are inevitably tied to larger social injustices. While not prepared to offer up a fully formed theory of oppression, Gouges is readying the space where that work can be done.

In all of her writings, both literary and political, one finds an unflinching self-confidence and a desire for justice. What Lisa Beckstrand (2009) refers to as the "theme of the global family" also runs through Gouges's literary work. Familial obligations dominate and are responses to the inadequacies of the state. The plight of the illegitimate child, the unmarried mother, the poor, the commoner (at least by 1792), the orphan, the unemployed, the slave, even the King when he is most vulnerable, are all brought to light, with family connection and sympathy for the most disadvantaged as the pivotal plot points. Women characters regularly displace men at center stage. The image of women unified with each other and winning the recognition of men most characterizes what Gouges conveys in her work. She is the first to bring several taboo issues to the stage, divorce and slavery among them. Especially prolific in the four years from 1789 to her death in 1793, she displays a political passion, labeled conservative by many, that is still astonishing for its persistence in a culture that too often worked strenuously to stifle women's voices.

b. Political

The impact of Olympe de Gouges’s political activism is commemorated by her inclusion as the only French woman on revolutionary and abolitionist Abbé Henri Grégoire’s (1750-1831) list of “all those men [sic] who have had the courage to plead the cause” of abolition.  The list is included in the introduction to his On the Cultural Achievements of Negroes (1808), which was written as a counter to Thomas Jefferson’s less admiring look at race in Notes on the State of Virginia (1782). As with so much that came to prominence with modern feminism, indignation at injustice must have started for Gouges with her own marginalization as a woman, but it shifted to the external world with a recognition of the inhumanity of slavery. While she was not an immediatist like some in the next generation of abolitionists such as William Lloyd Garrison (1805-1879) in the U.S., abolitionist thought permeated her understanding of the world and of herself as a writer, and soon grounded her thinking on women. The French Revolution itself transformed Gouges’s thinking further when the rights of citizens, despite pleas from the Girondins, were not applied to the female citizen. In fact, female political participation of all kinds was formally banned by the French National Assembly in 1793, after one of several uprisings led by women. Gouges’s political tenacity showed itself most virulently in prison where she “mounted the rostrum” at least two final times, smuggling out pamphlets that condemned prison conditions and that accepted—indeed recklessly demanded—responsibility for her ideas, challenging how the rights of freedom of speech were embodied in the new Constitution.

Her path from social nonconformist, to political activist and reformer, to martyr was one untrodden by women. Many of the most influential eighteenth-century intellectuals—with a few exceptions—were convinced women did not have the intellectual capacity for politics. Gouges challenged that perception, while also problematizing it by writing hastily, sometimes dictating straight to the printer. While her uneven education opened her to ridicule, it gave her a critical affinity for Rousseauean ideas, as will be discussed below. Both her playwriting and her productivity as a pamphleteer gained her a celebrity that she used as a podium for her advocacy of the marginalized, and for drawing attention to the importance of the preservation of the state.

Gouges’s formal petitions to the National Assembly, and her public calls for governmental and social reform through the press and through her pamphleteering (common in France), went far beyond the The Rights of Woman of 1791 (see next section) and her stance on slavery. Among her many calls for change in this medium were: (as mentioned above) a demand for a national theatre dedicated to the works of women; a voluntary tax system (one of the few demands she published anonymously and which saw implementation the following year); state-sponsored working-groups for the unemployed; social services for widows, the elderly and orphans; civil rights for illegitimate children and unmarried mothers; suppression of the dowry system; regulation of prostitution; sanitation; rights of divorce; rights to marriage for priests and nuns; people’s juries for criminal trials; and the abolition of the death penalty.  She petitioned the National Assembly on a number of occasions on these and other matters. Whether or not related to her efforts, the National Assembly did pass laws in 1792 giving illegitimate children some of the civil rights for which she fought and granting women the right to divorce, even while women remained legal nonentities overall.

She is frequently touted as a (sometimes the) founder of modern feminism for her unrelenting advocacy for women's rights in writing and in action. While the historical record is more complicated than that, in historian John R. Cole's accounting (2011), "she published on current affairs and public policy more often and more boldly than any other woman. . . and [s]he made a more formal and sweeping demand for the extension of full civil and political rights to women than any prior person, male or female, French or foreign" (231). Her call for women to identify as women and band together in support of each other can also be considered a contribution to the revolutionary cause and to the concept of citizenship, and remains today an important focus for modern feminism. Others in France were also proposing feminist ideas, although none as actively and comprehensively as Gouges: most notably, François Poulain de la Barre (1647-1725) in the 17th century and Marie Madeleine Jodin (1741-90), Louise de Keralio-Robert (1758?-1822), Nicolas de Condorcet (1743-1794), and Etta Palm d'Aelders (Dutch, 1732-1799) in the 18th century. The latter two petitioned the National Assembly unsuccessfully in 1790 to ensure legal rights for women.

Gouges’s writings share many themes with what would become classic texts in the feminist movement of the 20th century:  independence of mind and body, access to political rights and political voice, education, and elimination of the sexual double standard.  Her awareness of these themes spring from her experience as a woman, solidified by her unhappy early marriage, her unapologetic and ostensibly scandalous first years in Paris, through to her sometimes-thwarted, oft-derided, attempts at participation in cultural, literary and political realms. She experienced firsthand how the rights of the citizen were denied women. Her early history, her frustration at being denied, or dismissed as, a voice in the public sphere, and the ridicule she withstood, aimed at her gender, gave shape to insights emblematic of much later feminist theory and concretized for her an understanding of the link between the public and private realms. Gouges contributed markedly to the depth and breadth of the discourse on women’s rights in late 18th century France, and on the plight of the underprivileged in general. As Cole summarizes: “she tried to rally other women behind a radical extension of liberty and equality into domestic relationships . . . and she [advocated for] the extension of rights to free persons of color and free blacks and to vindicate the full humanity of slaves in the Caribbean colonies” (231).

Perhaps most indicative of Gouges's political courage and intellectual self-reliance was the stance which led to her death. Her decision to continue to publish works deemed seditious even as the danger of arrest grew shows courage and commitment to her advocacy of the less fortunate and exemplifies her self-definition as a political activist. She was the only woman executed for sedition during the Reign of Terror (1793-1794). That fact, along with comments such as those of Pierre Chaumette: "[r]emember the shameless Olympe de Gouges, . . . who abandoned the cares of her household to involve herself in the republic, and whose head fell under the avenging blade of the laws. Is it for women to make motions? Is it for women to put themselves at the head of our armies?" (Andress, 234), suggests that her outspoken views were greeted with increased hostility because of her gender. In May or June of 1793 her poster The Three Urns [or Ballot Boxes] appeared, calling for a referendum to let the people decide the form the new government should take. Proposing three forms of government (republic, federal government, or constitutional monarchy), the essay was interpreted as a defense of the monarchy and used as justification for her arrest in September. Her continuing preference for a constitutional monarchy was likely propelled in part by her disappointment with the Revolution, but more specifically by her opposition to the death penalty and her general humanitarian inclinations. She appears to have had no elemental dispute with monarchy per se, problematizing any philosophical understanding of her commitment to human rights. The Rights of Woman, for instance, is dedicated to the Queen—as a woman, but presumably because she is the Queen.

c. The Rights of Woman (1791)

By far her most well-known and distinctly feminist work, The Rights of Woman (1791) was written as a response to The Declaration of the Rights of Man and of the Citizen, written in 1789 but officially the preamble to the French Constitution as of September 1791. Despite women’s participation in the Revolution and regardless of sympathies within the National Assembly, that document was the death knell for any hopes of inclusion of women’s rights under the “Rights of Man.” The patriarchal understanding of female virtue and sexual difference held sway, supported by Rousseau’s perspective on gender relations, perpetuating the view that women’s nurturing abilities and responsibilities negated political participation. Political passivity was itself seen as a feminine responsibility.

The Rights of Woman appeared originally as a pamphlet printed with five parts: 1) the dedication to the Queen, 2) a preamble addressed to "Man," 3) the Articles of the Declaration, 4) a baffling description of a disagreement about a fare between herself and a cab driver, and 5) a critique of the marriage contract, modeled on Rousseau's Social Contract. It most often appears (at least in English translation) without the fourth part (Cole, 2011). Forceful and sarcastic in tone and militant in spirit, its third section takes up each of the seventeen Articles of the Preamble to the French Constitution in turn and highlights the glaring omission of the female citizen within each article. Meant to be a document ensuring universal rights, the Declaration of the Rights of Man and the Citizen is exposed thereby as anything but. The immediacy of the implications of the Revolution finally fully awakened Gouges to the ramifications of being denied equal rights, but her entire oeuvre had been aiming in this direction. Gouges wrote a document that highlights her personal contradictions (her own monarchist leanings as they hinder full autonomy most obviously), while bringing piercing illumination to contradictions in the French Constitution. Despite the lack of attention Gouges's pamphlet received at the time, her greatest contribution to modern political discourse is the highlighting of the inadequacy of attempts at universality during the Enlightenment. The demands contained within the original document assert the universality of "Man" while denying the specificity required for "Woman," and the document therefore collapses—at least logically—under its own weight. Alert to the powerlessness of women and the injustice such a condition implies, Article 4 of The Rights of Woman, for instance, particularly calls for protection from tyranny, as "liberty and justice" demand; that is, as nature and reason demand in personal as well as political terms. To harmonize this document with her devotion to the monarchy for most of her political career takes significant effort.

For Gouges, the most important expression of liberty was the right to free speech; she had been exercising that right for almost a decade. Access to the rostrum required more than an early version of “add women and stir.” While Gouges’s Rights is rife with such pluralizing—extending any right of Man to Woman as well—there is also a clear acknowledgement that blind application of universal principles is insufficient for the pursuit of equality. Article XI, for example, demands the right of women to name the father of their children. The peculiarity of the need for this right on the part of women stands out because of its specificity and demonstrates the contradictions created by blindness to gender. The citizen of the French Revolution–the idealistic universal—is the free white adult male, leaving in his wake many injustices peculiar to individuals excluded from that “universal.” The Rights of Woman unapologetically highlights that problem.

The Enlightenment presumption of the “natural rights” of the citizen (as in “inalienable rights” in the U.S. Declaration of Independence) is in direct contradiction to the equally firmly-held belief in natural sexual differences—both of which are so-called “founding principles of nature.” While Gouges is not fully aware of the implications of this conflict, she holds unequivocally in The Rights of Woman that those natural rights do indeed grant equality to all, just as the French Declaration states but does not intend. The rights such equality implies need to be recognized as having a more far-reaching application; if rights are natural and if these rights are somehow inherent in bodies, then all bodies are deserving of such rights, regardless of any particularities, like gender or color.

Marriage, as the center of political exploitation, is thoroughly lambasted in the postamble, Part 5, of The Rights of Woman. Gouges describes marriage as the "tomb of trust and love," and the place of "perpetual tyranny." The primary site of institutionalized inequality, marriage creates the conditions for the development of women's unreliability and capacity for deception. Just as Mary Wollstonecraft (1759-1797) does in A Vindication of the Rights of Woman (1792), Gouges points to female artifice and weakness as a consequence of woman's powerless place in this legalized sexual union. Gouges, much like Wollstonecraft, attempts to combat societal deficiencies: the vicious circle which neglects the education of its females and then offers their narrower interests as the reason for the refusal of full citizenship. Both, however, see the resulting fact of women's corruption and weak-mindedness as a major source of the problems of society, but herein also lies the solution. Borrowing from Rousseau, Gouges proposes a "social contract" as a replacement for traditional marriage, reformulating his social contract with a focus that obliterates his gendered conception of the citizen and creates the conditions for both parties to flourish. In the "Form for a Social Contract Between Man and Woman," Gouges offers a kind of civil union based on equality, which will create the "moral means of achieving the perfection of a happy government!" The state is a (reconceived) marriage writ large, for Gouges. What ails government are fixed social hierarchies impossible to maintain. What heals a government is an equal balance of powers and a shared virtue (consistent with her continuing approval of a constitutional monarchy). Marriages are to be voluntary unions between equal rights-bearing partners who hold property and children mutually and dispose of both by agreement. All children produced during this union have the right to their mother's and father's name, "from whatever bed they come."

Given that she remained a monarchist almost to the end, her authorship of this document, along with her lack of formal education, suggests that Gouges possessed less than full comprehension of what we now view as the discourse on universal human rights. That said, the production of this document has influenced exactly that conversation, and thus her presence in the list of historical figures who matter philosophically has to be acknowledged. Even Wollstonecraft, in her Vindication of 1792, does not call for the complete reinvention of women as political selves, as does Gouges in 1791.

d. Philosophical

As with most Enlightenment thinking, a natural rights tradition—although not any kind of comprehensive theory of natural rights—can be found in Gouges's views on the origins of citizenship and rights for women and blacks. By 1791, she is arguing that equality is natural; it "had only to be recognized." It appears that Gouges did not see any contradiction in her royalist leanings; nevertheless, she may no longer have been a monarchist by this point. She held that the human mind has no sex, an idea traceable in the modern era as far back as Poulain de la Barre (1673); men and women are equally human, therefore capable of the same thoughts. While her lack of education precluded the use of any systematic methodology, the consistency of her advocacy for the powerless, for pacifism, and (eventually) for the universal application of moral and legal rights is of great merit, and remains, if not based in rigorous philosophical analysis, philosophically astute. Her writings, both literary and political, point in directions contemporary feminist philosophy traversed for much of the twentieth century and beyond. She presciently foreshadows the "masculine universal" of liberal democracy identified by much contemporary feminist thought. She rejected the perceptions of sexual difference used to drive women out of the political arena, while she advocated for women's "special interests." Echoing Plato's attention to gender in The Republic, she saw natural differences between the genders, but not of a kind relevant to the tasks of the citizens of the state. Despite appearing after the French Constitution had been decreed and its text fixed, Gouges's The Rights of Woman was aimed at expanding, even supplanting, the official French Declaration. Focusing on women as human and thus equal, but with pregnancy and motherhood as special differences, Gouges seemed comfortable with the resulting conceptual dissonance.

On the heels of The Rights of Woman, she published The Philosopher Prince (1792), a novel where ideas in the realm of political philosophy (perhaps influenced by the historical events of the previous three years) are most on display. With the marriage contract from The Rights of Woman as a template, she unpacks reasons for the lack of solidarity between the sexes; she depicts women living in a mythical society where education becomes a requirement for civic virtue; access to reason is necessary so that women grow up equal to men and engaged in public life. Azoulay (2009) gives the most scholarly attention to date to this novel, arguing that it provides evidence that Gouges was a monarchist only insofar as monarchy was the best means to preserve the nation. “Gouges did not seek the preservation of the monarchy but rather the revival of the kingdom” (43).

Earlier, in 1789, the pamphlet Le Bonheur primitif de l'homme [The Original Happiness of Man] also gives hints of a political philosophy. Gouges imagines a society where women were granted an education and encouraged in the development of their agency. Individual happiness depends on collective happiness, but collective happiness comes from the natural qualities found within families (Beckstrand's "theme of the global family"). While agreeing with Rousseau that civilization corrupts, she parts ways with him on the education of females. Though she is never directly critical of Rousseau, the implication is that the corruptibility of civilization can be countered only if "Sophie" is raised within a nurturing, egalitarian family with as much freedom and natural exploration as Rousseau proposes for "Emile." The happiness of all requires, among many other things, that "[g]irls will go to the fields and guard the animals" (tr. Harth, 1992, 222). This pamphlet also contains her call for a national theatre for women.

We find fragments of a larger philosophical perspective wherever we look. The female as subject rather than object, especially in political discourse, is among the most important and prevalent of these. Her understanding of the value of her own voice creates an understanding of self that challenges gender norms head on, withstands all public criticism, and refuses to collapse under the weight of taboo. A nascent moral philosophy can be unearthed by considering her lifelong attention to the plight of the disadvantaged. An early example is the postface to the second printing of her first staged play, written prior to its first staging, but after a long battle with the Comédie Française to have it staged. Réflexions sur les hommes nègres [Reflections on Black Men] raises questions about personhood and race. Gouges identifies race as a social construct insofar as slavery condemns blacks to being bought and sold "like cows at market." She is horrified at what privileged men will do in the name of profit. "It is only color" that differentiates the African from the European. And difference in color is simply the beauty of nature. If "they are animals, are we not like they?" There is explicit criticism here of a binary way of seeing the world.

An atheist, she critiques religion—particularly Catholicism—by focusing on its oppressiveness, especially towards women. Religion should not prohibit one from listening to reason or encourage one to be "deaf to nature." The celibacy of priests and nuns lays the ground for corruption and plays a role in religion's oppressiveness as well, she proclaimed. Throughout her writings, respect for the individual appears more vividly than Enlightenment philosophers generally could conceive; it grounds her pacifism, inspires her attention to children, and underscores her political vision. And, in part through her reverence for Rousseau, she sees problems with the separation between the private and the public spheres, a separation both devastating in its practical implications and invigorating in its theoretical possibilities.

i. Gouges and Rousseau

The writings of Jean-Jacques Rousseau (1712-1778) were a major influence on the French Revolution, as was the then-recent success of the American Revolution (1776). Among Gouges's lost plays is one titled Les Rêveries de Rousseau, la Mort de Jean-Jacques à Ermenonville [Reveries of Rousseau, the Death of Jean-Jacques at Ermenonville] (1791)—she was an ardent admirer, calling him her "spiritual father". While it is clear that Rousseau's philosophy as a systematizable whole was ignored by the revolutionaries, his idea that the power of the government should come from "the consent of the governed" inspired the overthrow of an absolutist monarchy. His advocacy of rule by the general will helped inspire the French to shed their monarchist allegiances and take to the streets. His theory of education for boys promoted non-interference and encouraged conditions that would allow nature to take its course. His claim that this was the most direct route to virtue and would produce the best kind of self was aimed at the man a boy like Emile would become, and was taken seriously by men and women alike in France and beyond.

Gouges described herself as a “pupil of pure nature,” embracing a Rousseauian perspective on education while imposing on it her own perspective on gender. The education Rousseau proposed for girls was mind-numbingly stifling; they were to be raised to understand they were “made for man’s delight.”

“They must be subject, all their lives, to the most constant and severe restraint, which is that of decorum: it is, therefore, necessary to accustom them early to such confinement, that it may not afterwards cost them too dear; and to the suppression of their caprices, that they may the more readily submit to the will of others” (Emile, or On Education, V:1297).

Rousseau claimed virtues for the male arose most purely when the individual was not constrained by civilization. He proposed that boys turn out best when left to themselves. He advocated freedom for Emile. But when it came to females, a system of cultural constraints was necessary in order to ensure the properly compliant nature for a companion for Emile. "[A]mong men, opinion is the tomb of virtue; among women it is the throne" (Emile, V:1278). Universal principles and the "masculine virtues" applied only to those dominant men. In terms of gender, Rousseau was influencing the Revolution in precisely the way with which Gouges found fault.

Gouges agreed with Rousseau's understanding of how education of the citizenry could transform society. Yet, seeing well beyond Rousseau in terms of gender, she proclaimed in The Rights of Woman that the failure of society to educate its women was the "sole cause" of the corruption of government. Gouges, in a number of documents, as has been noted, anticipated contemporary feminist philosophy's claim that modern liberal democracy operated with a deficient notion of the universal because it lacked inclusiveness. Historically, woman is seen as the complementary and contrasting counterbalance to man; if man is a political animal, woman is a domestic one. While the prominent French revolutionaries worked to apply Rousseauean themes to, among other things, the exclusion of women from political participation, Gouges fought to raise awareness of what was largely a misapplication of Rousseau's social critique, while never naming Rousseau. According to her, if social systems are human-made and they tend to cause the evils of the world because they interfere with nature, then just as it is for males, so must it be for females. Despite his intentions, Rousseau's egalitarian vision had immense feminist implications. Men's tyranny over women, for Gouges, is clearly contrary to nature, not in sync with it. Her use of "social contract" in the postscript to the Declaration is a direct appropriation of Rousseau. Her social contract proclaims that the right in marriage to equal property and parental and inheritance rights is the only way to build a society of "perfect harmony." As with Beauvoir a century and a half later, Gouges calls on women to take responsibility for their condition and to demand equality; she conceived of that equality in the masculine, and applied it to herself. She accepted Rousseau's understanding of "nature" and his discourse on rights. She wrote The Original Happiness of Man in 1789 as a demonstration of her debt to Rousseau. There, she acknowledges her lack of formal education (as she often did in her writings), suggesting that she could see some things more clearly because of that deficit (she is "at once placed and displaced in this enlightened age . . ."). Her freedom comes from her lack of constraint, originating, in part, in her lack of formal education. The artificial constraints she encounters are unjustly thrust upon her by her society.

Rousseau’s condemnation of class distinctions also spoke directly to Gouges’s experience. But he did not question the right of the sovereign over the governed and Gouges, despite her monarchism, does at times do so with vigor. (When that sovereign is in one’s own household–all the more need for vigor.) For instance, in her pamphlet containing a proposal for a female national guard (Sera-t-il Roi, ne le sera-t-il pas? [Will he, or will he not, be King?] (1791), she criticizes a sovereign who would ask its citizenry to go to war, suggesting that such a request contradicts the very essence of citizenship. Being a citizen requires one to honor one’s relationship to the state, not to deliberately put that relationship in jeopardy. At the center of an understanding of political life should be a commitment not to take life–that is, to preserve the polis as a whole.

3. Relevance and Legacy

In addition to the political activism just mentioned, Gouges foreshadowed Henry David Thoreau (1817-1862), Mahatma Gandhi (1869-1948), and Martin Luther King, Jr. (1929-1968) by calling for disobedience to obviously unjust laws. Her argument for protections for the deposed French king comes not so much from her royalist tendencies as from her understanding of the "global family" and from her pacifism, as well as from her understanding of the separability of sovereign power from the individual who inhabits that power. Once the sovereignty is removed, the individual, she believed, was no longer synonymous with that figurehead. Putting to death the man who held that title but had since relinquished it is a miscarriage of justice.

Through the several articles of the third part of The Rights of Woman she helped to problematize the notion of universalizability as a moral good, and she helped to formulate and to popularize the notion that the word 'man', when used generically, may be problematic.

Gouges critiqued the principle of equality touted in France because it gave no attention to whom it left out, and she worked to claim the rightful place of women and slaves within its protection. She moved the Querelle des Femmes ("the woman question")—which had its origin in France in the Middle Ages and pivoted around what role(s) women should rightly play in society—out of the abstract and into the political arena. She moved the discussion of slavery from an abstract, distant one (an issue for the colonies only) literally to center stage and specifically highlighted the moral irrelevance of color. The "color of man is nuanced," she wrote; and she questioned how, if blonds are not superior to brunettes, and mulattos not superior to Negroes, whites can be any different. Color cannot be a criterion for dehumanization. Her challenge to traditional binaries wherever she found them may be the culminating arc of her work and where we can find our greatest debt to her. Her fictional characters all strain against the straitjackets of their identities: strong women vie for their independence in conversation with sympathetic men rather than pitting themselves against each other in rivalry over men; men and women bond together to right some significant wrong; women seek strength in other women and unify to accomplish morally worthy goals; men often relinquish their arbitrary right to power over women to work in tandem to accomplish just goals. Men are depicted applauding women's success. Her political pamphlets demonstrate her commitment to an overhaul of society. Revolutions, she insisted, could not succeed without the inclusion of women. And, since blacks demonstrated their humanity with every step and every breath in her plays, their enslavement was an indictment of French society.

Though Gouges herself was silenced for a time by history, scholarship on her has increased steadily since the publication of Olivier Blanc's first biography in 1981 (no English translation; supplanted by his 2003 publication). A singular presence in the French Revolution and a founding influence on the direction of women's and human rights, Gouges is an important historical figure for her resistance to gendered social norms and her insistence on the revolutionary nature of the application of women's rights. If not herself a philosopher, she had the stamina and the intellect to shape ideas that have been and continue to be philosophically relevant and valuable. Her ability to attain status and power and the public rostrum despite her background and her gender is astonishing. Her refusal to be silent in the face of injustices, both personal and social, contains the roots of her legacy.

4. References and Further Reading

a. Extant Works by Olympe de Gouges (in French)

  • Théâtre politique I, preface by Gisela Thiele-Knobloch (Paris: Côté-femmes éditions, 1992)–includes Le Couvent, Mirabeau aux Champs-Elysées, and L’Entrée de Dumourier à Bruxelles.
  • Oeuvres complètes: Théâtre, Félix-Marcel Castan, ed. Montauban: Cocagne, 1993 (comprises the twelve extant plays, including two in manuscript).
  • Théâtre politique II, preface by Gisela Thiele-Knobloch (Paris: Côté-femmes éditions, 1993)—comprises L’Homme généreux, Les Démocrates et les aristocrates, La Nécessité du divorce, La France sauvée ou le tyran détrôné, and Le Prélat d’autrefois, ou Sophie et Saint-Elme.
  • Ecrits politiques, 1788-1791, volume 1, preface by Olivier Blanc (Paris: Côté-femmes éditions, 1993).
  • Action héroïque d’une françoise, ou La France sauvée par les femmes. Paris: Guillaume Junior, 1789.
  • L’entrée de Dumouriez à Bruxelles, ou Les vivendiers. gallica.bnf.fr.
  • Ecrits politiques, 1792-1793, volume 2, preface by Blanc (Paris: Côté-femmes éditions, 1993).
  • Mémoire de Madame de Valmont, 1788, roman (Paris: Côté-femmes éditions, 1995).
  • La France sauvée, ou Le tyran détrôné. gallica.bnf.fr.
  • Mon dernier mot à mes chers amis. gallica.bnf.fr.
  • Oeuvres, ed. Benoîte Groult. Paris: Mercure de France, 1986.
  • Repentir de Madame de Gouges. 1791. gallica.bnf.fr.
  • Les fantômes de l'opinion publique, 1791? gallica.bnf.fr.
  • Pour sauver la patrie, il faut respecter les trois ordres : c'est le seul moyen de conciliation qui nous reste. (1789). gallica.bnf.fr.
  • Le cri du sage, par une femme. (1789?). gallica.bnf.fr.
  • Dialogue allégorique entre la France et la vérité, dédié aux états généraux. 1789. gallica.bnf.fr.

b. On-line English Translations of Gouges’s Original Works

  • On-going translation project by Clarissa Palmer:  Available at www.olympedegouges.eu.
  • The Rights of Woman (1791) [titled here Declaration of the Rights of Women and the Female Citizen, 1791]: www.fordham.edu/halsall/mod/1791degouge1.asp.
  • Transcript of her trial:  chnm.gmu.edu/revolution/d/488/.
  • “Reflections on Negroes” (trans. Sylvie Molta) www.uga.edu/slavery/texts/literary_works/reflections.pdf.
  • “Response to the American Champion” (trans. Maryann DeJulio) www.uga.edu/slavery/texts/literary_works/reponseenglish.pdf.
  • Additional material: www.uga.edu/slavery/texts/other_works.htm#1789.

c. Secondary Sources in English (except Blanc)

  • Andress, David. The Terror: the Merciless War for Freedom in Revolutionary France. New York: Farrar, Straus and Giroux, 2006.
  • Azoulay, Ariella. “The Absent Philosopher-Prince: Thinking Political Philosophy with Olympe de Gouges.” Radical Philosophy 158 (Nov/Dec 2009).
  • Beckstrand, Lisa. Deviant Women of the French Revolution and the Rise of Feminism. Associated University Presses, 2009.
  • Beauvoir, Simone de. The Second Sex. Trans. H. M. Parshley. Vintage Books (Random House) (1989) [1952].
  • Blanc, Olivier. Une Femme de Libertés: Olympe de Gouges. Paris: Syros-Alternatives, 1989.
  • Blanc, Olivier. Marie-Olympe de Gouges: Une Humaniste à la fin du XVIIIe siècle. Paris: Editions René Viénet, 2003.
  • Brown, Gregory S. “The Self-Fashionings of Olympe de Gouges, 1784-1789.” Eighteenth-Century Studies, 34(3) (2001), 383-401.
  • Cole, John. Between the Queen and the Cabby. Montreal, Quebec, Canada:  McGill-Queen’s University Press, 2011 (contains the only full length translation of all five parts of The Rights of Woman).
  • Diamond, Marie Josephine. “The Revolutionary Rhetoric of Olympe de Gouges.” Feminist Issues, 14(1) (1994), 3-23.
  • Fraisse, Genevieve, and Michelle Perrot, eds. A History of Women in the West, Vol. 4: Emerging Feminism from Revolution to World War. Cambridge, MA: Harvard University Press, 1993.
  • Garrett, Aaron. "Human Nature." In The Cambridge History of Eighteenth-Century Philosophy, Volume 1, ed. Knud Haakonssen. New York, NY: Cambridge University Press, 2006. 160-233.
  • Green, Karen.  A History of Women’s Political Thought in Europe, 1700-1800. New York, NY:  Cambridge University Press, 2014.  (Especially Chapter 9: “Anticipating and experiencing the revolution in France,” 203-234.)
  • Groult, Benoîte, ed. (French) "Olympe de Gouges: la première féministe moderne," in Olympe de Gouges: Oeuvres. Paris: Mercure de France, 1988.
  • Harth, Erica. Cartesian Women: Versions and Subversions of Rational Discourse in the Old Regime.  Ithaca, NY:  Cornell University Press, 1992.
  • Levy, Darline Gay, Harriet Branson Applewhite and Mary Durham Johnson. Women in Revolutionary Paris: 1789-1795: Selected Documents Translated with Notes and Commentary. Urbana: University of Illinois Press, 1979.
  • Mattos, Rudy Frederic de. The Discourse of Women Writers in the French Revolution: Olympe de Gouges and Constance de Salm. 2007.  Unpublished dissertation.  University of Texas at Austin Electronic Theses and Dissertations.
  • Maclean, Marie. "Revolution and Opposition: Olympe de Gouges and the Déclaration des droits de la femme." In Literature and Revolution. Ed. David Bevan. Amsterdam: Rodopi, 1989.
  • Melzer, Sara E. and Leslie W. Rabine, eds. Rebel Daughters:  Women and the French Revolution. New York: Oxford University Press, USA, 1992.
  • Miller, Christopher L. The French Atlantic Triangle:  Literature and Culture of the Slave Trade.  Durham, NC:  Duke University Press, 2008.
  • Monedas, Mary Cecilia.  “Neglected Texts of Olympe de Gouges, Pamphleteer of the French Revolution of 1789,” Advances in the History of Rhetoric 1.1 (1996).  43-54.
  • Montfort-Howard, Catherine, ed. Literate Women and the French Revolution of 1789. Birmingham, AL: Summa Publications, 1994.
  • Mousset, Sophie.  Women’s Rights and the French Revolution: a Biography of Olympe de Gouges. New Brunswick, NJ: Transaction Publishers, 2007.
  • Nielson, Wendy C. “Staging Rousseau’s Republic:  French Revolutionary Festivals and Olympe de Gouges.” Eighteenth-Century: Theory and Interpretation, 43(3) (Fall 2003), 265-85.
  • Nielson, Wendy C. Women Warriors in Romantic Drama. Lanham, MD: The University of Delaware Press, 2013.
  • O’Neill, Eileen. “Early Modern Women Philosophers and the History of Philosophy.” Hypatia 20(3) (2005), 185-197.
  • Scott, Joan Wallach. “French Feminists and the Rights of ‘Man’: Olympe de Gouges’s Declarations.” History Workshop 28 (1989), 1-21.
  • Scott, Joan Wallach. Only Paradoxes to Offer:  French Feminists and the Rights of Man. Cambridge, MA: Harvard University Press, 1996.
  • Sherman, Carol L.  Reading Olympe de Gouges. New York, NY:  Palgrave Macmillan, 2013.
  • Spencer, Samia I., ed.  French Women and the Age of Enlightenment. Bloomington, IN: Indiana University Press, 1984.
  • Trouille, Mary Seidman. “Eighteenth-Century Amazons of the Pen: Stéphanie de Genlis & Olympe de Gouges.” Femmes Savants et Femmes d’Esprit: Women Intellectuals of the French Eighteenth Century. Eds. Roland Bonnel and Catherine Rubinger. New York: Peter Lang, 1994.
  • Trouille, Mary Seidman. Sexual Politics in the Enlightenment: Women Writers Read Rousseau. Albany: State U of New York Press, 1997.
  • Vanpée, Janie. "La Déclaration des Droits de la Femme: Olympe de Gouges's Re-Writing of La Déclaration des Droits de l'Homme." In Montfort-Howard, Catherine, ed. Literate Women and the French Revolution of 1789. Birmingham, AL: Summa Publications, 1994. 55-80.
  • Verdier, Gabrielle. “From Reform to Revolution: The Social Theater of Olympe de Gouges.” In Montfort-Howard, Catherine, ed. Literate Women and the French Revolution of 1789. Birmingham, AL: Summa Publications, 1994. 189-224.

 

Author Information

Joan Woolfrey
Email: jwoolfrey@wcupa.edu
West Chester University of Pennsylvania
U. S. A.

Kant: Philosophy of Mind

Immanuel Kant (1724-1804) was one of the most important philosophers of the Enlightenment Period (c. 1650-1800) in Western European history. This encyclopedia article focuses on Kant's views in the philosophy of mind, which undergird much of his epistemology and metaphysics. In particular, it focuses on metaphysical and epistemological doctrines forming the core of Kant's mature philosophy, as presented in the Critique of Pure Reason (CPR) of 1781/87 and elsewhere.

There are certain aspects of Kant's project in the CPR that should be very familiar to anyone versed in the debates of seventeenth- and eighteenth-century European philosophy. For example, Kant argues, like Locke and Hume before him, that the boundaries of substantive human knowledge stop at experience, and thus that we must be extraordinarily circumspect concerning any claim made about what reality is like independent of all possible human experience. But, like Descartes and Leibniz, Kant thinks that central parts of human knowledge nevertheless exhibit characteristics of necessity and universality, and that, contrary to Hume's skeptical arguments, there is good reason to think so.

Kant carries out a ‘critique’ of pure reason in order to show its nature and limits, thereby curbing the pretensions of various metaphysical systems articulated on the basis that reason alone allows us to scrutinize the depths of reality. But Kant also argues that the legitimate domain of reason is more extensive and more substantive than previous empiricist critiques had allowed. In this way Kant salvages (or attempts to) much of the prevailing Enlightenment conception of reason as an organ for knowledge of the world.

This article discusses Kant’s theory of cognition, including his views of the various mental faculties that make cognition possible. It distinguishes between different conceptions of consciousness at the basis of this theory of cognition and explains and discusses Kant’s criticisms of the prevailing rationalist conception of mind, popular in Germany at the time.

Table of Contents

  1. Kant’s Theory of Cognition
    1. Mental Faculties and Mental Representation
      1. Sensibility, Understanding, and Reason
      2. Imagination and Judgment
    2. Mental Processing
  2. Consciousness
    1. Phenomenal Consciousness
    2. Discrimination and Differentiation
    3. Self-Consciousness
      1. Inner Sense
      2. Apperception
    4. Unity of Consciousness and the Categories
  3. Concepts and Perception
    1. Content and Correctness
    2. Conceptual Content
    3. Conceptualism and Synthesis
    4. Objections to Conceptualism
  4. Rational Psychology and Self-Knowledge
    1. Substantiality (A348-51/B410-11)
    2. Simplicity (A351-61/B407-8)
    3. Numerical Identity (A361-66/B408)
    4. Relation to Objects in Space (A366-80/B409)
      1. The Immediacy Argument
      2. The Argument from Imagination
    5. Lessons of the Paralogisms
  5. Summary
  6. References and Further Reading
    1. Kant’s Works in English
    2. Secondary Sources

1. Kant’s Theory of Cognition

Kant is primarily interested in investigating the mind for epistemological reasons. One of the goals of his mature “critical” philosophy is articulating the conditions under which our scientific knowledge, including mathematics and natural science, is possible. Achieving this goal requires, in Kant’s estimation, a critique of the manner in which rational beings like ourselves gain such knowledge, so that we might distinguish those forms of inquiry that are legitimate, such as natural science, from those that are illegitimate, such as rationalist metaphysics. This critique proceeds via an examination of those features of the mind relevant to the acquisition of knowledge. This examination amounts to a survey of the conditions for “cognition” [Erkenntnis], or the mind’s relation to an object. Although there is some controversy about the best way to understand Kant’s use of this term, this article will understand it as involving relation to a possible object of experience, and as being a necessary condition for positive substantive knowledge (Wissen). Thus to understand Kant’s critical philosophy, we need to understand his conception of the mind.

a. Mental Faculties and Mental Representation

Kant characterizes the mind along two fundamental axes – first by the various kinds of powers which it possesses and second by the results of exercising those powers.

At the most basic explanatory level, Kant conceives of the mind as constituted by two fundamental capacities [Fähigkeiten], or powers, which he labels “receptivity” [Receptivität] and “spontaneity” [Spontaneität]. Receptivity, as the name suggests, constitutes the mind’s capacity to be affected by something, whether itself or something else. In other words, the mind’s receptive power essentially requires some external prompt to engage in producing “representations” [Vorstellungen], which are best thought of as discrete mental events or states, of which the mind is aware, or in virtue of which the mind is aware of something else (it is controversial whether representations are objects of ultimate awareness or are merely a vehicle for such awareness). In contrast, the power of spontaneity needs no such prompt. It is able to initiate its activity from itself, without any external trigger.

These two capacities of the mind are the basis for all (human) mental behavior. Kant thus construes all mental activity either in terms of its resulting from affection (receptivity) or from the mind's self-prompted activity (spontaneity). From these two very general aspects of the mind Kant then derives three further basic faculties or "powers" [Vermögen], which he terms "sensibility" [Sinnlichkeit], "understanding" [Verstand], and "reason" [Vernunft]. These faculties characterize specific cognitive powers. None of these powers can be reduced to any of the others, and each is assigned a particular cognitive task.

i. Sensibility, Understanding, and Reason

Kant distinguishes the three fundamental mental faculties from one another in two ways. First, he construes sensibility as the specific manner in which human beings, as well as other animals, are receptive. This is in contrast with the faculties of understanding and reason, which are forms of spontaneity belonging to humans, or to all rational beings. Second, Kant distinguishes the faculties by their output. All of the mental faculties produce representations. We can see these distinctions at work in what is generally called the "stepladder" [Stufenleiter] passage from the Transcendental Dialectic of Kant's major work, the Critique of Pure Reason (1781/7). This is one of the few places in the entire Kantian corpus where Kant explicitly discusses the meanings of and relations between his technical terms, and defines and classifies varieties of representation.

The genus is representation (representatio) in general. Under it stand representations with consciousness (perceptio). A perception [Wahrnehmung], that relates solely to a subject as a modification of its state, is sensation (sensatio). An objective perception is cognition (cognitio). This is either intuition or concept (intuitus vel conceptus). The first relates immediately to the object and is singular; the second is mediate, conveyed by a mark, which can be common to many things. A concept is either an empirical or a pure concept, and the pure concept, insofar as it has its origin solely in the understanding (not in a pure image of sensibility), is called notio. A concept made up of notions, which goes beyond the possibility of experience, is an idea or a concept of reason. (A320/B376–7).

As Kant’s discussion here indicates, the category of representation contains sensations [Empfindungen], intuitions [Anschauungen], and concepts [Begriffe]. Sensibility is the faculty that provides sensory representations. Sensibility generates representations based on being affected either by entities distinct from the subject or by the subject herself. This is in contrast to the faculty of understanding, which generates conceptual representations spontaneously – i.e. without advertence to affection. Reason is that spontaneous faculty by which special sorts of concepts, which Kant calls ‘ideas’ or ‘notions’, may be generated, and whose objects could never be met with in “experience,” which Kant defines as perceptions connected by fundamental concepts. Some of reason’s ideas include those concerning God and the soul.

Kant claims that all the representations generated via sensibility are structured by two "forms" of intuition—space and time—and that all sensory aspects of our experience are their "matter" (A20/B34). The simplest way of understanding what Kant means by "form" here is that anything one might experience will either have spatial features, such as extension, shape, and location, or temporal features, such as being successive or simultaneous. So the formal element of an empirical intuition, or sense perception, will always be either spatial or temporal. Meanwhile, the material element is always sensory (in the sense of determining the phenomenal or "what it is like" character of experience) and tied either to one or more of the five senses or to the feelings of pleasure and displeasure.

Kant ties the two forms of intuition to two distinct spheres or domains, the “inner” and the “outer.” The domain of outer intuition concerns the spatial world of material objects while the domain of inner intuition concerns temporally ordered states of mind. Space is thus the form of “outer sense” while time is the form of “inner sense” (A22/B37; cf. An 7:154). In the Transcendental Aesthetic, Kant is primarily concerned with “pure” [rein] intuition, or intuition absent any sensation, and often only speaks in passing of the sense perception of physical bodies (for example A20–1/B35). However, Kant more clearly links the five senses with intuition in his 1798 work Anthropology from a Pragmatic Point of View, in the section entitled “On the Five Senses.”

Sensibility in the cognitive faculty (the faculty of intuitive representations) contains two parts: sense and the imagination…But the senses, on the other hand, are divided into outer and inner sense (sensus internus); the first is where the human body is affected by physical things, the second is where the human body is affected by the mind (An 7:153).

Kant characterizes intuition generally in terms of two characteristics—namely immediacy [Unmittelbarkeit] and particularity [Einzelheit] (cf. A19/B33, A68/B93; JL 9:91). This is in contrast to the mediacy and generality [Allgemeinheit] characteristic of conceptual representation (A68/B93; JL 9:91).

Kant contrasts the particularity of intuition with the generality of concepts in the “stepladder” passage. Specifically, Kant says a concept is related to its object via “a mark, which can be common to many things” (A320/B377). This suggests that intuition, in contrast to concepts, puts a subject in cognitive contact with features of an object that are unique to particular objects and are not had by other objects. Spatio-temporal properties seem like excellent candidates for such features, as no two objects of experience can have the very same spatio-temporal location (B327-8). But perhaps any non-repeatable, non-universal feature of a perceived object will do. Some debate whether the immediacy of intuition is compatible with an intuition’s relating to an object by means of marks, or whether relation by means of marks entails mediacy and, thus, that only concepts relate to objects by means of marks. For relevant discussion see Smit (2000); Grüne (2009), 50, 66-70.

Though Kant’s discussion of intuition suggests that it is a form of perceptual experience, this might seem to clash with his distinction between “experience” [Erfahrung] and “intuition” [Anschauung]. In part, this is a terminological issue. Kant’s notion of an “experience” is typically quite a bit narrower than our contemporary English usage of the term. Kant actually equates, at several points, “experience” with “empirical cognition” (B166, A176/B218, A189/B234), which is incompatible with experience being falsidical in any way. He also gives indications that experience, in his sense, is not something had by a single subject. See, for example, his claim that there is only one experience (A230/B282-3).

Kant also distinguishes intuition from “perception” [Wahrnehmung], which he characterizes as the conscious apprehension of the content of an intuition (Pr 4:300; cf. A99, A119-20, B162, and B202-3). “Experience,” in Kant’s sense, is then construed as a set of perceptions that are connected via fundamental concepts that Kant entitles the “categories.” As he puts it, “Experience is cognition through connected perceptions [durch verknüpfte Wahrnehmungen]” (B161; cf. B218; Pr 4:300).

Empirical intuition, perception, and experience, in Kant’s usage of these terms, all denote kinds of “experience” as we use the term in contemporary English. At its most primitive level, empirical intuition presents some feature of the world to the mind in a sensory manner, and does so in such a way that the intuition’s subject is in a position to distinguish that feature from others. A perception, in Kant’s sense, requires awareness of the basis on which the feature differs from other things. (Kant uses the term in a variety of ways, however—JL 9:64-5, for instance—so there is some controversy surrounding its proper understanding.) One has a perception, in Kant’s sense, when one can not only discriminate one thing from another, or between the parts of a single thing, based on a sensory apprehension of it, but can also articulate exactly which features of the object or objects distinguish it from others. For instance, one can say that it is green rather than red, or that it occupies this spatial location rather than that one. Intuition thus allows for the discrimination of distinct objects via an awareness of their features, while perception allows for an awareness of what specifically distinguishes an object from others. “Experience,” in Kant’s sense, is even further up the cognitive ladder (see JL 9:64-5), insofar as it indicates an awareness of features such as the substantiality of a thing, its causal relations with other beings, and its mereological features, that is, its part-whole dependence relations.

Kant thus believes that the capacity to cognitively ascend from mere discriminatory awareness of one’s environment (intuition), to an awareness of those features by means of which one discriminates (perception), and finally to an awareness of the objects which ground these features (experience), depends on the kinds of mental processes of which the subject is capable.

Before turning to the issue of mental processing, which figures centrally in Kant’s overall critical project, there are two further faculties of the mind that are worth discussing—the faculties of imagination and judgment. These faculties are not obviously as fundamental as the faculties of sensibility, understanding, and reason, but they nevertheless play a central role in Kant’s thinking about the structure of the mind and its contributions to our experience of the world.

ii. Imagination and Judgment

Kant links the faculty of imagination closely to sensibility. For example, in his Anthropology he says,

Sensibility in the cognitive faculty (the faculty of intuitive representations) contains two parts: sense and the power of imagination. The first is the faculty of intuition in the presence of an object, the second is intuition even without the presence of an object. (An 7:153; cf. 7:167; B151; LM 29:881; LM 28:449, 673)

The contrast Kant makes here is not entirely obvious, but includes at least the difference between cases of occurrent sensory experience of a perceived object—seeing the brown table before you—and cases of sensory recollection of a previously perceived object—visually imagining the brown table that was once in front of you. Kant makes this clearer in the process of further distinguishing between different kinds of imagination.

The power of imagination (facultas imaginandi), as a faculty of intuition without the presence of the object, is either productive, that is, a faculty of the original presentation [Darstellung] of the object (exhibitio originaria), which thus precedes experience; or reproductive, a faculty of the derivative presentation of the object (exhibitio derivativa), which brings back to mind an empirical intuition that it had previously (An 7:167).

So, in the operation of productive imagination, one brings to mind a sensory experience that is not itself based on any object previously so experienced. This is not to say the productive imagination is totally creative. Kant explicitly denies (An 7:167) that the productive imagination has the power to generate wholly novel sensory experience. It could not, in a person born blind, produce the phenomenal quality associated with the experience of seeing a red object, for example. If the productive imagination is instrumental in producing sensory fictions, the reproductive imagination is instrumental in producing sensory experiences of previously perceived objects.

Imagination thus plays a central role in empirical cognition by serving as the basis for both memory and the creative arts. In addition, it also plays a kind of mediating role between the faculties of sensibility and understanding. Kant calls this mediating role a “transcendental function” of the imagination (A124). Imagination can play this role because its functioning is tied to both faculties. On the one hand, it produces sensible representations, and is thus connected to sensibility. On the other hand, it is not a purely passive faculty but rather engages in the activity of bringing together various representations, as memory does, for example. Kant explicitly connects understanding with this kind of active mental processing.

Kant also goes so far as to claim that the activity of imagination is a necessary part of what makes perception, in his technical sense of a string of connected, conscious sensory experiences, possible (A120, note). Though Kant’s view concerning the exact role of imagination in sensory experience is contested, two points emerge as central. First, Kant believes imagination plays a crucial role in the generation of complex sensory representations of an object (see Sellars (1978) for an influential example of this interpretation). It is imagination that makes it possible to have a sensory experience of a complex, three-dimensional geometric figure whose identity remains constant even as it is subject to translations and rotations in space. Second, Kant regards imagination’s mediating role between sensibility and understanding as crucial for at least some kinds of concept application (see Guyer (1987) and Pendlebury (1995) for further discussion). This mediating role involves what Kant calls the “schematization” of a concept and an additional mental faculty, that of judgment.

Kant defines the faculty of judgment as “the capacity to subsume under rules, that is, to distinguish whether something falls under a given rule” (A132/B171). However, he spends comparatively little time discussing this faculty in the first Critique. There, it seems to be discussed as an extension of the understanding in that it applies concepts to empirical objects. It is not until the third Critique—Kant’s 1790 Critique of Judgment—that Kant distinguishes judgment as an independent faculty with a special role. There Kant specifies two different ways it might function (CJ 5:179; cf. CJ (First Introduction) 20:211).

In one, judgment subsumes given objects under concepts, which are themselves already given. This role appears identical to the role he assigns judgment in the Critique of Pure Reason. The basic idea is that judgment functions to assign an intuited object—a dog, say—to the correct concept—such as that of a domestic animal. This concept is presumed to be one already possessed by the subject. In this activity, the faculty overlaps with the role Kant singles out for imagination in the section of the first Critique entitled ‘On the Schematism of the Pure Concepts of the Understanding.’ Both are here conceived of in terms of the ultimate functioning of understanding, since it is understanding that generates concepts.

The second role for the faculty of judgment, and what seems to make it a distinctive faculty in its own right, is that of finding a concept under which to “subsume” experienced objects. This is called judgment’s “reflecting” role (CJ 5:179). Here, the subject exercises judgment in generating an appropriate concept for what is given by intuition (CJ (First Introduction) 20:211-13; JL 9:94–95; for discussion see Longuenesse (1998), 163–166 and 195–197; Ginsborg (2006)).

In addition to the generation of empirical concepts, Kant also describes reflective judgment as responsible for the sorting and classifying of objects in nature into a hierarchical taxonomy of genus/species relationships, a task central to scientific inquiry. Kant also utilizes the notion of reflective judgment to unify the otherwise seemingly unrelated topics of the Critique of Judgment—aesthetic judgments and teleological judgments concerning the order of nature.

Thus far, the discussion of Kant’s view of the mind has focused primarily on the various mental faculties and their corresponding representational output. Both the faculty of imagination and that of judgment operate on representations given from sensibility and understanding. In general, Kant conceives of the mind’s activity in terms of different methods of “processing” representations.

b. Mental Processing

Kant’s term for mental processing is “combination” [Verbindung], and the form of combination with which he is primarily concerned is what he calls “synthesis.” Kant characterizes synthesis as that activity by which understanding “runs through” and “gathers together” representations given to it by sensibility in order to form concepts and judgments, and ultimately for any cognition to take place at all (A77-8/B102-3). Synthesis is not something people are typically aware of doing. As Kant says, it is “a blind though indispensable function of the soul…of which we are only seldom even conscious” (A78/B103).

Synthesis is carried out by the unitary subject of representation upon representations either given to the subject by sensibility or produced by the subject through thought. When synthesis operates on representations so as to form the content of a concept or judgment, it is “intellectual” synthesis; when carried out by the imagination on material provided by sensibility, it is called “figurative” synthesis (B150-1). In the Critique of Pure Reason, Kant is primarily concerned with synthesis performed on representations provided by sensibility, and he discusses three central kinds of synthesis—apprehension, reproduction (or imagination), and recognition (or conceptualization) (A98-110/B159-61). Though Kant discusses these forms of synthesis as if they were discrete types of mental acts, it seems that the first two forms must occur together, while the third may, but need not, accompany them (compare Brook (1997); Allais (2009)).

One of the central topics of debate in the interpretation of Kant’s views on synthesis is whether Kant endorses conceptualism. Roughly, conceptualism claims that the capacity for conscious sensory experience of the objective world depends, at least in part, on the repertoire of concepts possessed by the experiencing subject, insofar as those concepts are exercised in acts of synthesis by the understanding.

Kant typically contrasts synthesis with other ways in which representations might be related, most importantly, by association (for example B139-40). Association is primarily a passive process by which the mind comes to connect representations due to repeated exposure of the subject to certain kinds of regularities. One might, for example, associate thoughts of chicken soup with thoughts of being ill, if one only had chicken soup when one was ill. In contrast, synthesis is a fundamentally active process that depends upon the mind’s spontaneity and is the means by which genuine judgment is possible.

Consider, for example, the difference between the merely associative transition between holding a stone and feeling its weight compared to the judgment that the stone is heavy (B142). The association of holding the stone and feeling its weight is not yet a judgment about the stone, but a kind of involuntary connection between two states of oneself. In contrast, thinking the stone is heavy moves beyond associating two feelings to a thought about how things are objectively, independent of one’s own mental states (Pereboom (1995), Pereboom (2006)). One of Kant’s most important points concerning mental processing is that association cannot explain the possibility of objective judgment. What is required, he says, is a theory of mental processing by an active subject capable of acts of synthesis.

Several of the important differences between synthesis and association can be summarized as follows (Pereboom (1995), 4-7):

  1. The source of synthesis is to be found in a subject, and the subject is distinct from its states.
  2. Synthesis can employ a priori concepts, concepts independent of experience, as modes of processing representations, whereas association never does.
  3. Synthesis is the product of a causally active subject. It is produced by a cause that is realized in one of the subject’s faculties, either the imagination or the understanding.

Kant’s conception of synthesis and judgment is tied to his conception of “consciousness” [Bewußtsein] and “self-consciousness” [Selbstbewußtsein]. However, both notions require some significant unpacking.

2. Consciousness

The notion of consciousness [Bewußtsein] plays an important role in Kant’s philosophy. There are, however, several different senses of “consciousness” in play in Kant’s work, not all of which line up with contemporary philosophical usage. Below, several of Kant’s most central notions and their differences from and relations to contemporary usage are explained.

a. Phenomenal Consciousness

Philosophical discussions of consciousness typically focus on phenomenal consciousness, or “what it is like” to have a conscious experience of a particular kind, such as seeing the color red or smelling a rose. Such qualitative features of consciousness have been of major concern to philosophers of the late 20th Century. However, the metaphysical issue of phenomenal consciousness is almost entirely ignored by Kant, perhaps because he is unconcerned with problems stemming from commitments to naturalism or physicalism. He seems to attribute all qualitative characteristics of consciousness to sensation and what he calls “feeling” [Gefühl] (CJ 5:206). Kant distinguishes between sensation and feeling in terms of an objective/subjective distinction. Sensations indicate or present features of objects, distinct from the subject. Feelings, by contrast, present only states of the subject to consciousness. Kant’s typical examples of such feelings include pain and pleasure (B66-7; CJ 5:189, 203-6).

Kant clearly assigns a cognitive role to sensation and allows that it is “through sensation” that we cognitively relate to objects given in sensibility (A20/B34). Despite that, he does not focus in any substantive or systematic way on the phenomenal aspects of sensory consciousness, nor does he focus on how exactly they aid in cognition of the empirical world.

b. Discrimination and Differentiation

The central notion of “consciousness” with which Kant is concerned is that of discrimination or differentiation. This is the conception of consciousness most commonly used in Kant’s time, particularly by his major predecessors Gottfried Wilhelm Leibniz (1646–1716) and Christian Wolff (1679-1754), and Kant gives little indication that he departs from their general practice.

According to Kant, any time a subject can discriminate one thing from another, the subject is, or can be, conscious of that thing (An 7:136-8). Representations which allow for discrimination and differentiation are “clear” [klar]. Representations which allow not only for the differentiation of one thing from others (such as differentiating one person’s face from another’s), but also for the differentiation of parts of the thing so discriminated (such as differentiating the different parts of a person’s face), are called “distinct” [deutlich].

Kant does, however, seem to deny the claim of the Leibniz-Wolff tradition that clarity can simply be equated with consciousness (B414-15, note). Primarily, he seems motivated to allow that one’s discriminatory capacities may outrun one’s capacity for memory or even for the explicit articulation of that which is discriminated. In such cases, one does not have a fully clear representation.

Kant’s conception of an “obscure” [dunkel] representation is that it allows the subject to discriminate differentially between aspects of her environment without any explicit awareness of how she does so. This connects him with the Leibniz-Wolff tradition of recognizing the existence of unconscious representations (An 7:135-7). Kant says that the majority of the representations to which we appeal in order to explain the complex, discriminatory behaviors of living organisms are “obscure” in this technical sense. Likening the mind to a map, Kant goes so far as to say,

The field of sensuous intuitions and sensations of which we are not conscious, even though we can undoubtedly conclude that we have them; that is, obscure representations in the human being (and thus also in animals), is immense. Clear representations, on the other hand, contain only infinitely few points of this field which lie open to consciousness; so that as it were only a few places on the vast map of our mind are illuminated. (An 7:135)

Thus, obscure representations are ones of which we have no direct or non-inferential awareness; they must be posited to explain our fine-grained, differential, and discriminatory capacities. They constitute the majority of the mental representations with which the mind busies itself.

Though Kant does not make it explicit in his discussion of discrimination and consciousness, it is clear that he takes the capacity to discriminate between objects and parts of objects to be ultimately based on sensory representation of those objects. His views on consciousness as differential discrimination intersect with his views on phenomenal consciousness. Because humans are receptive through their sensibility, the ultimate basis on which we differentially discriminate between objects must be sensory. Thus, though Kant seems to take for granted the fact that conscious beings are in states with a particular phenomenal character, it must be the clarity and distinctness of this character that allows a conscious subject to differentially discriminate between the various elements of her environment (see Kant’s discussion of aesthetic perfection in the 1800 Jäsche Logic, 9:33-9 for relevant discussion).

c. Self-Consciousness

As the discussion of unconscious representation indicates, Kant believes we are not directly aware of most of our representations. They are nevertheless, to some degree, conscious, because they allow differential discrimination of elements of the subject’s environment. Kant thinks the process of making a representation clear, or fully conscious, requires a higher-order representation of the relevant representation. In other words, it requires that the subject can have representations of other representations. As Kant says, “consciousness is really the representation that another representation is in me” (JL 9:33). Because this higher-order representation is one of another representation in the subject, Kant’s position here suggests that consciousness requires at least the capacity for self-consciousness. This position is reinforced by Kant’s famous claim in the Transcendental Deduction of the Critique of Pure Reason:

The I think must be able to accompany all my representations; for otherwise something would be represented in me that could not be thought at all, which is as much as to say that the representation would either be impossible or else at least would be nothing for me. (B131-2; emphasis in the original)

Kant might give the impression here of saying that, for representation to be possible for a subject, the subject must possess the capacity for self-ascribing her representations. If so, then representation, and thus the capacity for conscious representation, would depend on the capacity for self-consciousness. Because Kant ties the capacity for self-consciousness to spontaneity (B132, 137, 423) and restricts spontaneity to the class of rational beings, the demand for self-ascription would seem to deny that any non-rational animal (for example, dogs, cats, and birds) could have phenomenal or discriminatory consciousness.

However, there is little evidence to show that Kant endorses the self-ascription condition. Instead, he distinguishes between two distinct modes in which one is aware of oneself and one’s representations—inner sense and apperception (See Ameriks (2000) for extensive discussion). Only the latter form of awareness seems to demand a capacity for self-ascription.

i. Inner Sense

Inner sense is, according to Kant, the means by which we are aware of alterations in our own state. Hence all moods, feelings, and sensations, including such basic alterations as pleasure and pain, are the proper subject matter of inner sense. Kant argues that all sensations, feelings, and other representations attributable to a subject must ultimately occur in inner sense and conform to its form—time (A22-3/B37; A34/B51).

Thus, to be aware of something in inner sense is to be minimally, phenomenally conscious, at least in the case of awareness of sensations and feelings. To say a subject is aware of her own states via inner sense is to say that she has a temporally ordered series of mental states and is phenomenally conscious of each, though she may not be conscious of the series as a whole. This could still count as a kind of self-awareness, as when an animal is aware of being in pain. But it is not an awareness of the subject as a self. Kant himself indicates such a position in a letter to his friend and former student Marcus Herz in 1789.

[Representations] could still (I consider myself as an animal) carry on their play in an orderly fashion, as connected according to empirical laws of association, and thus they could even have influence on my feeling and desire, without my being aware of my own existence [meines Daseins unbewußt] (assuming that I am even conscious of each individual representation, but not of their relation to the unity of representation of their object, by means of the synthetic unity of their apperception). This might be so without my cognizing the slightest thing thereby, not even what my own condition is (C 11:52, May 26, 1789).

Hence, according to Kant, one may be aware of one’s representations via inner sense, but one is not and cannot, through inner sense alone, be aware of oneself as the subject of those representations. That requires what Kant, following Leibniz (1996), calls “apperception”.

ii. Apperception

Kant uses the term “apperception” to denote the capacity for awareness of some state or modification of one’s self as a state of one’s self. For one capable of apperception, there is a difference between feeling pain, and thus having an inner sense of it, and apperceiving that one is in pain, and thus ascribing, or being able to ascribe, a certain property or state of mind to one’s self. For example, while a non-apperceptive animal is aware of its own pain, and its awareness is partially explanatory of its behavior, such as avoidance, Kant construes the animal as incapable of making any self-attribution of its pain. Kant thinks of such a mind as incapable of construing itself as a subject of states, and it is thus unable to construe itself as persisting through changes of those states. This is not necessarily to say that an animal incapable of apperception lacks any subject or self. But, at the very least, such an animal would be incapable of conceiving or representing itself in this way (see Naragon (1990); McLear (2011)).

Kant considers the capacity for apperception as importantly tied to the capacity to represent objects as complexes of properties attributable to a single underlying entity (for example, an apple as a subject of the complex of the properties red and round). Kant’s argument for this connection is notorious both for its complexity and for its obscurity. The next sub-section will give an overview, though not an exhaustive discussion, of some of Kant’s most important points concerning these matters, as they relate to the issue of apperception.

d. Unity of Consciousness and the Categories

In order to better understand Kant’s views on apperception and the unity of consciousness, one must step back and look at the wider context of the argument in which he situates these views. One of the core projects of Kant’s most famous work, the Critique of Pure Reason, is to provide an argument for the legitimacy of a priori knowledge of the natural world. Though Kant’s conception of the a priori is complex, Kant shares one central aspect of his view with his German rationalist predecessors (for example Leibniz (1996), preface): that we have knowledge of universal and necessary truths concerning aspects of the empirical world (B4-5). Such truths include the claim that every event in the empirical world has a cause (B231). This tradition tended to explain the possession of knowledge of such universal and necessary truths by appeal to innate concepts which could be analyzed to yield the relevant truths. Kant importantly departs from the rationalist tradition, arguing that not all knowledge of universal and necessary truths is acquired via the analysis of concepts (B14-18). Instead, he says there are some “synthetic” a priori truths that are known on the basis of something other than conceptual analysis. Thus, according to Kant, the activity of pure reason achieves relatively little on its own. All of our ampliative knowledge (knowledge that goes beyond what conceptual analysis alone can yield) that is also necessary and universal consists in what Kant calls “synthetic a priori” judgments or propositions. He then pursues the central question: how is knowledge of such synthetic a priori propositions possible?

Kant’s basic answer to the question of synthetic a priori knowledge involves what he calls the “Copernican Turn.” According to the “Copernican Turn,” the objects of human knowledge must “conform” to the basic faculties of human knowledge—the forms of intuition (space and time) and the forms of thought (the categories).

Kant thus engages in a two-part strategy for explaining the possibility of such synthetic a priori knowledge. The first part consists of arguing that the pure forms of intuition provide the basis for our synthetic a priori knowledge of mathematical truths. Mathematical knowledge is synthetic because it goes beyond mere conceptual analysis to deal with the structure of, or our representation of, space itself. It is a priori because the structure of space is accessible to us as it is merely the form of our intuition and not a real mind-independent thing.

In addition to the representation of space and time, Kant also thinks that possession of a particular, privileged set of a priori concepts is necessary for knowledge of the empirical world. But this raises a problem. How can an a priori concept, which is not itself derived from any particular experience, nevertheless be legitimately applicable to objects of experience? What is more difficult, it is not the mere possibility of applying a priori concepts to objects of experience that worries Kant, for this could just be a matter of pure luck. Kant wants more than mere possibility; he wants to show that a privileged set of a priori concepts applies necessarily and universally to all objects of experience, and does so in a way that people can know independently of experience.

This brings us to the second part of Kant’s argument, which is directly relevant for understanding Kant’s views on the importance of apperception. Not only must objects of knowledge conform to the forms of intuition, they must also conform to the most basic concepts (or categories) governing our capacity for thought. Kant’s strategy is to show how a priori concepts legitimately apply to their objects by being partly constitutive of the objects of representation. This contrasts with the traditional view, according to which the objects of representation were the source or explanatory ground of our concepts (B, xvii-xix). Now, exactly what this means is deeply contested, in part because it is rather unclear what Kant intends by his doctrine of Transcendental Idealism. Does Kant intend that the objects of representation are themselves nothing other than representations? This would be a form of phenomenalism similar to that offered by Berkeley. Kant, however, seems to want to deny that his view is similar to Berkeley’s, asserting instead that the objects of representation exist independently of the mind, and that it is only the way that they are represented that is mind-dependent (A92/B125; compare Pr 4:288-94).

Kant’s strategy for validating the legitimacy of the a priori categories proceeds by way of a “transcendental argument.” It ties the conditions necessary for consciousness of the identity of oneself as the subject of different self-attributed mental states to the conditions necessary for grounding the possibility of representing an object distinct from oneself, of which various properties may be predicated. In this sense, Kant argues that the intellectual representations of subject and object stand and fall together. Kant thus denies the possibility of a self-conscious subject who could conceptualize and self-ascribe her representations, but whose representations could not represent law-governed objects in space, and thus the material world or ‘nature’ as the subject conceives of it.

Though Kant’s views regarding the unity of the subject are contested, there are several points which can be made fairly clearly. First, Kant conceives of all specific, intellectual activity, including the most basic instances of discursive thought, as requiring what he calls the “original unity of apperception” (B132). This unity, as original, is not itself brought about by some mental act of combining representations, but, as Kant says, is “what makes the concept of combination possible” (B131). It is itself the ground of the “possibility of the understanding” (B131).

Second, the original unity of apperception requires whatever form of self-consciousness characteristically relates to the “I think.” As Kant famously says, “the I think must be able to accompany all my representations” (B131). Moreover, the “I think” essentially involves activity on the part of the subject—it is an expression of the subject’s free activity or “spontaneity” (B132). This means that, according to Kant, only beings capable of spontaneous activity—self-initiated activity that is ultimately traced to causes outside the reach of natural causal laws—are going to be capable of thought in the sense with which Kant is concerned.

Third, and related to the previous point, Kant seems to deny that a subject could attain the kind of representational unity characteristic of thought if her only resources were aggregative methods. Kant makes this point later in the Critique when he says, “representations that are distributed among different beings (for instance, the individual words of a verse) never constitute a whole thought (a verse)” (A 352). William James provides a vivid articulation of the idea: “Take a sentence of a dozen words, and take twelve men and tell to each one word. Then stand the men in a row or jam them in a bunch, and let each think of his word as intently as he will; nowhere will there be a consciousness of the whole sentence” (James (1890), 160). Kant construes consciousness as the “holding-together” of the various components of a thought. He does so in a manner that seems radically opposed to any conception of unitary thought which tries to explain it in terms of some train or succession of its components (Pr 4:304; see Kitcher (2010); Engstrom (2013) for contrasting treatments of this issue).

The exact content of Kant’s argument for the connection between subject and object in the Transcendental Deduction is highly disputed, and it is likely that no single reconstruction of the argument can capture all the points Kant supports in the Deduction. At least one strand of Kant’s argument in the first half of the Deduction focuses on Kant’s denial that the unity of the subject and its powers of representational combination could be accounted for by a merely associationist (or Humean) conception of mental combination, sometimes termed his “argument from above” (see A119; Carl (1989); Pereboom (1995)). Kant argues as follows (see Pereboom (2009)):

  1. I am conscious of the identity of myself as the subject of different self-attributions of mental states.
  2. I am not directly conscious of the identity of this subject of different self-attributions of mental states.
  3. If (1) and (2) are true, then this consciousness of identity is accounted for indirectly by my consciousness of a particular kind of unity of my mental states.
  4. Therefore, this consciousness of identity is accounted for indirectly by my consciousness of a particular kind of unity of my mental states. (1, 2, 3)
  5. If (4) is true, then my mental states indeed have this particular kind of unity.
  6. This particular kind of unity of my mental states cannot be accounted for by association. (5)
  7. If (6) is true, then this particular kind of unity of my mental states is accounted for by synthesis by a priori concepts.
  8. Therefore, this particular kind of unity of my mental states is accounted for by synthesis by a priori concepts. (6, 7)

Premise (1) says that I am aware of myself as the subject of different states (or at least able to be so aware). For example, right now I might be hungry as well as sleepy. Previously, I was sleepy and slightly bored. Premise (2) claims that I have no immediate or direct awareness of the being which has all of these states. In Kant’s terms, I lack any intuition of the subject of such self-ascribed states, instead having intuition only of the states themselves. Nevertheless, I am aware of all these states as related to a subject (it is I who am bored, hungry, sleepy), and it is in virtue of these connections that I can call one and all of these states mine. Hence, as premise (3) argues, there must be some unity to my mental states, consciousness of which indirectly accounts for my awareness of my identity as their subject. My representations must have some basis on which they go together, and it is the basis for their ‘togetherness’ that explains how I can consider them, one and all, to be mine. Premises (4) and (5) unpack this point, and premise (6) argues that association could not account for such unity (the theory of association was articulated in a particularly influential form by David Hume (1888; 2007), and the reader should consult the article on Hume for relevant background discussion).

Kant’s point, in premise (6) of the above argument, is that forces of association acting on mental representations, whether impressions or ideas, cannot account either for the experience of a train of representations as mine or for the “togetherness” of those representations, whether as a single thought or as a series of inferences. Hume argues we have no impression, and thus no ensuing idea, of an empirical self (Hume (1888), I.iv.6). Kant also accepts this point when he says, “the empirical consciousness that accompanies different representations is by itself dispersed and without relation to the identity of the subject” (B133). By this, Kant means that when we introspect in inner sense, all we ever get are particular mental states, such as boredom, happiness, or particular thoughts. We lack any intuition of a subject of those mental states. Hume concludes that the idea of a persisting self which grounds all of these mental states as its subject must be fictitious. Kant disagrees. His contrasting view takes the mineness and togetherness of one’s introspectible mental states as data needing explanation. Because an associative psychological theory like Hume’s cannot explain these features of first-person consciousness (see Hume (1888), III, Appendix), we need another theory, such as Kant’s theory of mental synthesis.

Recall that, prior to the argument of the Transcendental Deduction, Kant links the operations of synthesis to possession of a set of a priori concepts, or categories, not derived from experience. Hence, in arguing that synthesis is required to explain the mineness and togetherness of one’s mental states, and by linking synthesis to the application of the categories, Kant argues we could not have the experience of the mineness and togetherness of our mental states without applying the categories.

While this argument is only half of Kant’s argument in the first part of the Deduction, it shows how tight Kant took the connection to be between the capacities for spontaneity, synthesis, and apperception, and the legitimacy of the categories. The other half consists of an “argument from below,” which discerns the conditions necessary for the representation of unitary objects (see Pereboom (1995), (2009)). According to Kant, there is only one possible explanation of one’s apperceptive awareness of one’s psychological states as one’s own and of all of these states being related to one another: as the subject of such states, one possesses a spontaneous power for synthesizing one’s representations according to general principles or rules, the content of which is given by pure a priori concepts—the categories. The fact that the categories play such a fundamental role in the generation of self-conscious psychological states is thus a powerful argument demonstrating their legitimacy.

Given that Kant leverages certain aspects of our capacity for self-knowledge in his argument for the legitimacy of the categories, the extent to which he argues for radical limits on our capacity for self-knowledge may be surprising. The final section below examines Kant’s arguments concerning our capacity for a priori knowledge of the self and its fundamental features. First, however, the next section looks at one of the central debates in the interpretation of Kant: the role of concepts in perceptual experience.

3. Concepts and Perception

During the discussion of synthesis above, conceptualism was characterized as the claim that there is a relation of dependence between a subject’s having conscious sensory experience of an objective world and the repertoire of concepts possessed by the subject and exercised by her faculty of understanding.

As a first pass at sharpening this formulation, conceptualism may be understood as a thesis consisting of two claims: (i) sense experience has correctness conditions determined by the ‘content’ of the experience, and (ii) the content of an experience is a structured entity whose components are concepts.

a. Content and Correctness

An important background assumption governing the conceptualism debate construes mental states as related to the world cognitively, as opposed to merely causally, if and only if they possess correctness conditions. That which determines the correctness condition for a state is that state’s content (see Siegel (2010), (2011); Schellenberg (2011)).

Suppose, for example, that an experience E has the following content C:

C: That cup is white.

This content determines a correctness condition V:

V: S’s experience E is correct if and only if the cup visually presented to the subject as the content of the demonstrative is white and the content C corresponds to how things seem to the subject to be visually presented.

Here, the content of the experiential state functions much like the content of a belief state to determine whether the experience, like the belief, is or is not correct.

A state’s possession of content thus determines a correctness condition, through which the state can be construed as mapping, mirroring, or otherwise tracking aspects of the subject’s environment.

There are reasons for questioning whether Kant endorses the content assumption articulated above. Kant seems to deny several claims integral to it. First, in various places he explicitly denies that intuition, or the deliverances of the senses more generally, are the kind of thing which could be correct or incorrect (A293–4/B350; An §11 7:146; compare LL 24:83ff, 103, 720ff, 825ff). Second, Kant’s conception of representational content requires an act of mental unification (Pr 4:304; compare JL §17 9:101; LL 24:928), something which Kant explicitly denies is present in an intuition (B129-30; compare B176-7). This is not to deny that Kant uses a notion of “content,” in some other sense, but rather only that he fails to use it in the sense required by interpretations endorsing the content assumption (see Tolley (2014), (2013)). Finally, Kant’s “modal” condition of cognition, that it provides a demonstration of what is really possible rather than merely logically possible, seems to preclude an endorsement of the content assumption (B, xxvii, note; compare Chignell (2014)). However, for the purposes of understanding the conceptualism debate, assume Kant does endorse the content assumption. The question then is how to understand the nature of the content so understood.

b. Conceptual Content

In addition to the content assumption, conceptualism was defined above as committed to the claim that the content of an intuition is composed entirely of concepts. Against this, Clinton Tolley (Tolley (2013), Tolley (2014)) has argued that the immediacy/mediacy distinction between intuition and concept entails a difference in the content of intuition and concept.

If we understand by ‘content’…a representation’s particular relation to an object…then it is clear that we should conclude that Kant accepts non-conceptual content. This is because Kant accepts that intuitions put us in a representational relation to objects that is distinct in kind from the relation that pertains to concepts. I argued, furthermore, that this is the meaning that Kant himself assigns to the term ‘content’. (Tolley (2013), 128)

Insofar as Kant often speaks of the ‘content’ [Inhalt] of a representation as consisting in a particular kind of relation to an object (Tolley (2013), 112; compare B83, B87), Tolley’s proposal gives ground for a simple and straightforward argument for a non-conceptualist reading of Kant. However, it does not necessarily prove that the content of what Kant calls an intuition is not something that would be construed by others as conceptual, in a wider sense of that term. For example, both pure demonstratives—that, this—and complex demonstrative expressions—that color, this person—have conceptual form, and have been proposed as appropriate for capturing the content of experience (McDowell (1996), ch. 3; for discussion see Heck (2000)). Demonstratives are not, in Kant’s terms, ‘conceptual,’ since they do not exhibit the requisite generality which, according to Kant, all conceptual representation must possess.

c. Conceptualism and Synthesis

If it isn’t textually plausible to understand the content of an intuition in conceptual terms, at least as Kant understands the notion of a concept, then what would it mean to say that Kant endorses conceptualism with regard to experience? The most plausible interpretation, endorsed by a wide variety of interpreters, reads Kant as arguing that the generation of an intuition, whether pure or sensory, depends at least in part on the activity of the understanding. On this way of carving things, conceptualism does not consist in the narrow claim that intuitions have concepts as contents or components. Instead, it consists in the broader claim that the occurrence of an intuition depends at least in part on the discursive activity of understanding. The specific activity of understanding is that which Kant calls ‘synthesis,’ the “running through, and gathering together” of representations (A99).

The conceptualist further argues that taking intuitions as generated via acts of synthesis, which are directed by or otherwise dependent upon conceptual capacities, provides some basis for the claim that whatever correctness conditions might be had by intuition must accord with the conceptual synthesis which generated them. This arguably fits well with Kant’s much quoted claim,

The same function that gives unity to the different representations in a judgment also gives unity to the mere synthesis of different representations in an intuition, which, expressed generally, is called the pure concept of understanding. (A79/B104-5)

The link between intuition, synthesis in accordance with concepts, and relation to an object is made even clearer by Kant’s claim in §17 of the B-edition Transcendental Deduction:

Understanding is, generally speaking, the faculty of cognitions. These consist in the determinate relation of given representations to an object. An object, however, is that in the concept of which the manifold of a given intuition is united. (B137; emphasis in the original)

However else we are to understand this passage, Kant here indicates that the unity of an intuition necessary for it to stand as a cognition of an object requires a synthesis governed by the concept “object.” In other words, cognition of an object requires that intuition be unified by an act or acts of the understanding.

According to the conceptualist interpretation, one must understand the notion of a representation’s content as a relation to an object, which in turn depends on a conceptually guided synthesis. So we can revise our initial definition of conceptualism to read it as claiming (i) the content of an intuition is a kind of relation to an object, (ii) the relation to an object depends on a synthesis directed in accordance with concepts, and (iii) synthesis in accordance with concepts sets correctness conditions for the intuition’s representation of a mind-independent object.

d. Objections to Conceptualism

At the heart of non-conceptualist readings of Kant stands the denial that mental acts of synthesis carried out by understanding are necessary for the occurrence of cognitive mental states of the type which Kant designates by the term “intuition” [Anschauung]. Though it is controversial which reading should count as the “natural” or “default” reading of Kant’s mature critical philosophy, there are at least four considerations which lend strong support to a non-conceptualist interpretation of Kant’s mature work.

First, Kant repeatedly and forcefully states that in cognition there is a strict division of cognitive labor—objects are given by sensibility and thought via understanding:

Objects are given to us by means of sensibility, and it alone yields us intuitions; they are thought through the understanding, and from the understanding arise concepts (A19/B33; compare A50/B74, A51/B75–6, A271/B327).

As Robert Hanna has argued, when Kant discusses the dependence of intuition on conceptual judgment in the Analytic of Concepts, he specifically talks about cognition rather than what others would consider to be perceptual experience (Hanna (2005), 265-7).

Second, Kant characterizes the representational capacities characteristic of sensibility as more primitive than those characteristic of understanding, or reason, and he characterizes those capacities as a plausible part of what humans share with the rest of the animal kingdom (Kant connects the possession of a faculty of sensibility to animal nature in various places, for example A546/B574, A802/B830; An 7:196). For example, Kant’s distinction between the faculties of sensibility and understanding seems intended to capture the difference between the “sub-rational” powers of the mind that are shared with non-human animals and the “rational or higher-level cognitive powers” that are special to human beings (Hanna (2005), 249; compare Allais (2009); McLear (2011)).

If one were to deny that, according to Kant, sensibility alone is capable of producing mental states cognitive in character, then it would seem that any animal which lacks a faculty of understanding would thereby lack any capacity for genuinely perceptual experience. The mental lives of non-rational animals would thus, at best, consist of non-cognitive sensory states causally correlated with changes in the animal’s environment. Aside from offering an unappealing and implausible characterization of animals’ cognitive capacities, this reading also faces textual hurdles (for relevant discussion of some of the issues in contemporary cognitive ethology see Bermúdez (2003); Lurz (2009); Andrews (2014), as well as the papers in Lurz (2011)). Kant is on record in various places as saying that animals have sensory representations of their environment (CJ 5:464; LM 28:449; compare An 7:212), that they have intuitions (LL 24:702), and that they are acquainted with objects though they do not cognize them (JL 9:64–5) (see Naragon (1990); Allais (2009); McLear (2011)).

Hence, if Kant’s position is that synthetic acts carried out by the understanding are necessary for the cognitive standing of a mental state, then Kant is contradicting fundamental elements of his own position in crediting intuitions or their possibility to non-rational animals.

Third, any position which regards perceptual experience as dependent upon acts of synthesis carried out by the understanding would presumably also construe the ‘pure’ intuitions of space and time as dependent upon acts of synthesis (see Longuenesse (1998), ch. 9; Griffith (2012)). However, Kant’s discussion of space, and, analogously, time, in the third and fourth arguments (fourth and fifth in the case of time) of the Metaphysical Exposition of Space in the Transcendental Aesthetic seems incompatible with such a proposed relation of dependence.

Kant’s point in the third and fourth arguments of the Metaphysical Exposition of space and time is that no finite intellect could grasp the extent and nature of space as an infinite whole via a synthetic process involving movement from representation of a part to representation of the whole. If the unity of the forms of intuition were itself something dependent upon intellectual activity, then this unity would necessarily involve the discursive, though not necessarily conceptual, running through and gathering together of a given multiplicity (presumably of different locations or moments) into a combined whole. Kant believes this is characteristic of synthesis generally (A99).

But Kant’s arguments in the Metaphysical Expositions require that the fundamental basis of the representation of space and time not proceed from a grasp of the multiple features of an intuited particular to the whole possessing those features. Instead, the form of pure intuition constitutes a representational whole that is prior to the representation of its component parts (compare CJ 5:407-8, 409).

Hence, Kant’s position is that the pure intuitions of space and time possess a unity wholly different from that given by the discursive unity of understanding (whether in conceptual judgment or in the intellectual or imaginative synthesis of intuited objects). The unity of aesthetic representation—representation characterized by the forms of space and time—has a structure in which the representational parts depend upon the whole. The unity of discursive representation—representation in which the activity of understanding is involved—has a structure in which the representational whole depends upon its parts (see McLear (2015)).

Finally, there has been extensive discussion on the non-conceptuality of intuition in the secondary literature on Kant’s philosophy of mathematics. For example, Michael Friedman has argued that the expressive limitations of prevailing logic in Kant’s time required the postulation of intuition as a form of singular, non-conceptual representation (Friedman (1992), ch. 2; Anderson (2005); Sutherland (2008)). In contrast to Friedman’s view, Charles Parsons and Emily Carson argued that the immediacy of intuition, both pure and empirical, should be construed in a ‘phenomenological’ manner. Space in particular is understood on their interpretation as an original, non-conceptual representation, which Kant takes to be necessary for the demonstration of the real possibility of constructed, mathematical objects as required for geometric knowledge (Parsons (1964); Parsons (1992); Carson (1997); Carson (1999); compare Hanna (2002). For a general overview of related issues in Kant’s philosophy of mathematics, see Shabel (2006) and the works cited therein at p. 107, note 29.)

Ultimately, however, there are difficulties in assessing whether Kant’s philosophy of mathematics has any direct relevance for the conceptualism debate. It is not obvious whether the claim that intuition must be non-conceptual in order to account for mathematical knowledge is incompatible with the claim that intuitions themselves are dependent upon a conceptually guided synthesis.

The non-conceptualist reading is clearly committed to allowing that sensibility alone provides, perhaps in a very primitive manner, objective representation of the empirical world. Sensibility is construed as an independent cognitive faculty, which humans share with other, non-rational animals, and which is the jumping-off point for more sophisticated, conceptual representation of empirical reality.

The next and final section looks at Kant’s views regarding the nature and limits of self-knowledge and the ramifications of this for traditional rationalist views of the self.

4. Rational Psychology and Self-Knowledge

Kant discusses the nature and limits of our self-knowledge most extensively in the first Critique, in a section of the Transcendental Dialectic called the “Paralogisms of Pure Reason.” Here, Kant is concerned to criticize the claims of what he calls “rational psychology.” Specifically, he is concerned about the claim that we can have substantive, metaphysical knowledge of the nature of the subject, based purely on an analysis of the concept of the thinking self. As Kant typically puts it:

I think is thus the sole text of rational psychology, from which it is to develop its entire wisdom…because the least empirical predicate would corrupt the rational purity and independence of the science from all experience. (A343/B401)

There are four “Paralogisms.” Each argument is presented as a syllogism, consisting of two premises and a conclusion. According to Kant, each argument is guilty of an equivocation on a term common to the premises, such that the argument is invalid. Kant’s aim, in his discussion of each Paralogism, is to diagnose the equivocation, and explain why the rational psychologist’s argument ultimately fails. In so doing, Kant provides a great deal of information about his own views concerning the mind (See Ameriks (2000) for extensive discussion). The argument of the first Paralogism concerns knowledge of the self as substance; the second, the simplicity of the self; the third, the numerical identity of the self; and the fourth, knowledge of the self versus knowledge of things in space.

a. Substantiality (A348-51/B410-11)

Kant presents the rationalist’s argument in the First Paralogism as follows:

  1. What cannot be thought otherwise than as subject does not exist otherwise than as subject, and is therefore substance.
  2. Now a thinking being, considered merely as such, cannot be thought of as other than a subject.
  3. Therefore, a thinking being also exists only as such a thing, i.e., as substance.

Kant’s presentation of the argument is rather compressed. In more explicit form we can put it as follows (see Proops (2010)):

  1. All entities that cannot be thought of otherwise than as subjects are entities that cannot exist otherwise than as subjects, and therefore are substances. (All M are P)
  2. All entities that are thinking beings are entities that cannot be thought otherwise than as subjects. (All S are M)
  3. Therefore, all entities that are thinking beings are entities that cannot exist otherwise than as subjects, and therefore are substances. (All S are P)

The relevant equivocation occurs in the term that occupies the ‘M’ place in the argument—“entities that cannot be thought otherwise than as subjects.” Kant locates the ambiguity in the use of the term “thought” [das Denken]. In the first premise, he claims, it concerns an object in general, and hence an object that could be given in a possible intuition. In the second premise, “thought” applies only to a feature of thought and, thus, not to an object of a possible intuition (B411-12).

While it is not obvious what Kant means by this claim, one plausible reading is the following. Kant takes the first premise to make a claim about the objects of thought: whatever cannot be conceived of as anything other than an independent subject, or bearer of properties, must also exist as such a subject. This is thus a metaphysical claim about what kinds of objects could really exist, which explains Kant’s reference to an “object in general” that could be given in intuition.

In contrast, premise (2) makes a merely logical claim concerning the role of the representation <I> in a possible judgment. Kant’s point is that the representation <I> can occupy only the subject position of a judgment. For example, while I can make the claim “I am tall,” it would make no sense to claim “the tall is I.”
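
One schematic way to exhibit the equivocation is to regiment the two readings of the middle term as two distinct predicates. The notation below is purely illustrative; it is not Kant’s own, nor this article’s, and is offered only as a sketch of the logical structure.

\[
\begin{aligned}
&(\mathrm{P1})\quad \forall x\,\big(M_{\mathrm{obj}}(x) \rightarrow \mathrm{Substance}(x)\big)\\
&(\mathrm{P2})\quad \forall x\,\big(\mathrm{Thinker}(x) \rightarrow M_{\mathrm{log}}(x)\big)\\
&(\mathrm{C})\quad\;\;\, \forall x\,\big(\mathrm{Thinker}(x) \rightarrow \mathrm{Substance}(x)\big)
\end{aligned}
\]

Here M_obj(x) says that x, considered as an object that could be given in a possible intuition, cannot be thought of except as a subject or bearer of properties, while M_log(x) says only that the representation of x cannot occupy the predicate position of a judgment. The conclusion follows only if M_obj and M_log are the same predicate, which is precisely what Kant denies; read as he recommends, the syllogism trades on four terms rather than three.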

Against the rational psychologist, Kant argues that one cannot make any legitimate inference from the conditions under which representation <I> may be thought, or employed in a judgment, to the status of the ‘I’ as a metaphysical subject of properties. Kant makes this point explicit when he says,

The first syllogism of transcendental psychology imposes on us an only allegedly new insight when it passes off the constant logical subject of thinking as the cognition of a real subject of inherence, with which we do not and cannot have the least acquaintance, because consciousness is the one single thing that makes all representations into thoughts, and in which, therefore, as in the transcendental subject, our perceptions must be encountered; and apart from this logical significance of the I, we have no acquaintance with the subject in itself that grounds this I as a substratum, just as it grounds all thoughts. (A350)

Kant thus argues that one should distinguish different conceptions of “substance” and the roles they play in thoughts concerning the world.

Substance0:

x is a substance0 if and only if the representation of x cannot be used as a predicate in a categorical judgment

Substance1:

x is a substance1 if and only if its existence is such that it can never inhere, or exist, in anything else (B288, 407)

The first conception of substance is merely logical or grammatical. The second conception is explicitly metaphysical. Finally, there is an even more metaphysically demanding usage of “substance” that Kant employs.

(Empirical) Substance2:

x is a substance2 if and only if it is a substance1 that persists at every moment (A144/B183, A182)

According to Kant, the rational psychologist attempts to move from claims about substance0 to the more robustly metaphysical claims characteristic of conceptions and uses of substance1 and substance2. However, without further substantive assumptions, which go beyond anything given in an analysis of the concept <I>, no legitimate inference can be made from our notion of a substance0 to either of the other conceptions of substance.

Because Kant denies that humans have any intuition, empirical or otherwise, of themselves as subjects, he holds that they cannot come to have any knowledge concerning whether they are, in themselves, either substance1 or substance2. At least, they cannot do so by reflecting on the conditions of thinking of themselves using first-person concepts. No amount of introspection or reflection on the content of the first-person concept <I> will yield such knowledge.

b. Simplicity (A351-61/B407-8)

Kant’s discussion of the proposed metaphysical simplicity of the subject largely depends on points he made in the previous Paralogism concerning its proposed substantiality. Kant articulates the Second Paralogism as follows:

  1. The subject, whose action can never be regarded as the concurrence of many acting things, is simple. (All A is B)
  2. The self is such a subject. (C is A)
  3. Therefore, the self is simple. (C is B)

Here, the equivocation concerns the notion of a “subject.” Kant’s point, as with the previous Paralogism, is that, from the fact that one’s first-person representation of the self is always a grammatical or logical subject, nothing follows concerning the metaphysical status of that representation’s referent.

Of perhaps greater interest in this discussion of the Paralogism of simplicity is Kant’s analysis of what he calls the “Achilles of all dialectical inferences” (A351). According to the Achilles argument, the soul or mind is known to be a simple, unitary substance, because only such a substance could think unitary thoughts. The argument’s central premise, called the “unity claim” (see Brook (1997)), says:

(UC):

If a multiplicity of representations are to form a single representation, they must be contained in the absolute unity of the thinking substance. (A352)

Against UC, Kant argues that there is no reason to deny that the structure of a thought, as a complex of representations, might be mirrored in a correspondingly complex structure of the entity that thinks it. UC is not analytic, which is to say that no contradiction is entailed by its negation. Nor is UC a synthetic a priori claim, since it follows neither from the forms of intuition nor from the categories. Hence, UC could only be shown to be true empirically, and because people have no empirical intuition of the self, they have no basis for thinking that UC must be true (A353).

Kant here makes a point similar to one found in contemporary functionalist accounts of the mind (see Meerbote (1991); Brook (1997)). Mental functions, including the unity of conscious thought, are consistent with a variety of different media in which those functions are realized. Kant says there is no contradiction in thinking that a plurality of substances might succeed in generating a single, unified thought. Hence, we cannot know that the mind is such that it must be simple in nature.

c. Numerical Identity (A361-66/B408)

Kant articulates the Third Paralogism as follows:

  1. What is conscious of the numerical identity of its Self in different times, is, to that extent, a person. (All C is P)
  2. Now, the soul is conscious of the numerical identity of its Self in different times. (S is C)
  3. Therefore, the soul is a person. (S is P)

Rational psychologists’ interest in establishing the personality of the soul or mind stems from the importance of proving that not only would the mind persist after the destruction of its body, but also that this mind would be the same person as before, not just some sort of bare consciousness or worse (for example, existing only as a “bare monad”).

Kant here makes two main points. First, the rational psychologist cannot infer from the sameness of the first-person representation (the “I think”) across its applications in judgment to any conclusion concerning the sameness of the metaphysical subject referred to by that representation. Kant thus again makes a functionalist point. The medium in which a series of representational states inheres may change over time, and there is no contradiction in conceiving of a series of representations as being transferred from one substance to another (A363-4, note).

Second, Kant argues that we can be confident of the soul’s possession of personality by virtue of apperception’s persistence. The relevant notion of “personality” here distinguishes a rational being from an animal. While the persistence of apperception (the persistence of the “I think” as being able to attach to all of one’s representations) does not provide an apperceiving subject with any insight into the true metaphysical nature of the mind, it does provide evidence of the soul’s possession of an understanding. Animals, by contrast, do not possess an understanding but, at best, according to Kant, only an analogue thereof. As Kant says in the Anthropology,

That man can have the I among his representations elevates him infinitely above all other living beings on earth. He is thereby a person […] that is, by rank and worth a completely distinct being from things, such as reason-less animals, with which one can do as one pleases. (An 7:127, §1)

Hence, so long as a soul possesses the capacity for apperception, that capacity signals the possession of an understanding and thus serves to distinguish the human soul from that of an animal (see Dyck (2010), 120).

d. Relation to Objects in Space (A366-80/B409)

Finally, the Fourth Paralogism concerns the relation between awareness of one’s own mind and one’s awareness of other objects distinct from oneself. Thus, it also deals with one’s mind and awareness of space. Kant describes the Fourth Paralogism as follows:

  1. What can be only causally inferred is never certain. (All I is not C)
  2. The existence of outer objects can only be causally inferred, not immediately perceived by us. (O is I)
  3. Therefore, we can never be certain of the existence of outer objects. (O is not C)

Kant locates the damaging ambiguity in the conception of “outer” objects. This is puzzling because it doesn’t play the relevant role as middle term in the syllogism. But Kant is quite clear that this is where the ambiguity lies and distinguishes between two distinct senses of the “outer” or “external”:

Transcendentally Outer/External:

A separate existence, in and of itself.

Empirically Outer/External:

An existence in space.

Kant’s point here is that all appearances in space are empirically external to the subject who perceives or thinks about them, while nevertheless being transcendentally internal. Such spatial appearances do not have an entirely independent metaphysical nature, because their spatial features depend at least in part on our forms of intuition.

Kant then uses this distinction not only to argue against the assumption of the rational psychologist that the mind is better known than any object in space (famously argued by Descartes), but also against those forms of external world skepticism championed by Descartes and Berkeley. Kant identifies Berkeley with what he calls “dogmatic idealism” and Descartes with what he calls “problematic idealism” (A377). He defines them thus:

Problematic Idealism:

We cannot be certain of the existence of any material body.

Dogmatic Idealism:

We can be certain that no material body exists – the notion of a body is self-contradictory.

Kant brings two arguments to bear against the rational psychologist’s assumption about the immediacy of our self-knowledge, as well as these two forms of skepticism, with mixed results. The two arguments are from “immediacy” and “imagination.”

i. The Immediacy Argument

In an extended passage in the Fourth Paralogism (A370-1) Kant makes the following argument:

External objects (bodies) are merely appearances, hence also nothing other than a species of my representations, whose objects are something only through these representations, but are nothing separated from them. Thus external things exist as well as my self, and indeed both exist on the immediate testimony of my self-consciousness, only with this difference: the representation of my Self, as the thinking subject, is related merely to inner sense, but the representations that designate extended beings are also related to outer sense. I am no more necessitated to draw inferences in respect of the reality of external objects than I am in regard to the reality of the objects of my inner sense (my thoughts), for in both cases they are nothing but representations, the immediate perception (consciousness) of which is at the same time a sufficient proof of their reality. (A370-1)

It helps to understand the argument as follows:

  1. Rational Psychology (RP) privileges awareness of the subject and its states over awareness of non-subjective states.
  2. But, transcendental idealism entails that people are aware of both subjective and objective states, as they appear, in the same way—via a form of intuition.
  3. So, either both kinds of awareness are immediate or they are both mediate.
  4. Because awareness of subjective states is obviously immediate, awareness of objective states must also be immediate.
  5. Therefore, we are immediately aware of the states or properties of physical objects.

Here, Kant displays what he takes to be an advantage of Transcendental Idealism. Because both inner and outer sense depend on intuition, there is nothing special about inner intuition that privileges it over outer intuition. Both are, as intuitions, immediate presentations of objects, at least as they appear. Unfortunately, Kant never makes clear what he means by the term “immediate” [unmittelbar]. This issue is much contested (see Smit (2000)). At the very least, he means to signal that awareness in intuition is not mediated by any explicit or conscious inference, as when he says that the transcendental idealist “grants to matter, as appearance, a reality which need not be inferred, but is immediately perceived” (A371).

It is not obvious that an external world skeptic would find this argument convincing, since part of the grip of such skepticism relies on the seemingly compelling point that things could appear to one just as they currently do even if there really were no external world causing one’s experiences. That point, however, may simply beg the question against Kant, and in particular against premise (2) of the above argument. Certainly, Kant seems to think that his arguments for the existence of pure intuitions of space and time in the Transcendental Aesthetic lend some weight to his position. Thus, Kant is not so much arguing for Transcendental Idealism here as explaining some of the further benefits that come when the position is adopted. He does, however, present at least one further argument against the skeptical objection articulated above—the argument from imagination.

ii. The Argument from Imagination

Kant’s attempt to respond to the skeptical worry that things might appear to be outside us while not actually existing outside us appeals to the role imagination would have to play to make such a possibility plausible (A373-4; compare Anthropology, 7:167-8).

This material or real entity, however, this Something that is to be intuited in space, necessarily presupposes perception, and it cannot be invented by any power of imagination or produced independently of perception, which indicates the reality of something in space. Thus sensation is that which designates a reality in space and time, according to whether it is related to the one or the other mode of sensible intuition.

What follows is a reconstruction of this argument.

  1. If problematic idealism is correct, then it is possible for one to have never perceived any spatial object but only to have imagined doing so.
  2. But the imagination cannot fabricate sensory material on its own; it can only rework material already given in perception.
  3. So, if one has sensory experience of outer spatial objects, then one must have had at least one successful perception of an external spatial object.
  4. Therefore, it is certain that an extended spatial world exists.

Kant’s idea here is that the imagination is too limited to generate the various qualities that people experience as instantiated in external physical objects. Hence, it would not be possible to simply imagine an external physical world without having been originally exposed to the qualities instantiated in the physical world. Ergo, the physical world must exist. Even Descartes seems to agree with this, noting in Meditation I that “[certain simple kinds of qualities] are as it were the real colours from which we form all the images of things, whether true or false, that occur in our thought” (Descartes (1984), 13-14). Though Descartes goes on to doubt our capacity to know even such basic qualities given the possible existence of an evil deceiver, it is notable that the deceiver must be something other than ourselves, in order to account for all the richness and variety of what we experience (however, see Meditation VI (Descartes (1984), 54), where Descartes wonders whether there could be some hidden faculty in ourselves producing all of our ideas).

Unfortunately, it is not clear that the argument from imagination gets Kant the conclusion he wants, for all that it shows is that there was at one time a physical world, which affected one’s senses and provided the material for one’s sense experiences. This might be enough to show that one has not always been radically deceived, but it is not enough to show that one is not currently being radically deceived. Even worse, it is not clear that a physical world must exist to generate the requisite material for the imagination. Perhaps all that is needed is something distinct from the subject, something capable of generating in it the requisite sensory experiences, whether or not they are veridical. This conclusion is thus compatible with that “something” being Descartes’s evil demon or, in contemporary epistemology, with the subject’s being a brain in a vat. Hence, it is not obvious that Kant’s argument succeeds in refuting the skeptic. And even to the extent that it does, it still does not show that there is a physical world, as opposed merely to something distinct from the subject.
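
The first half of this criticism can be made vivid with a simple regimentation. Again, the notation is merely illustrative and is not Kant’s or the article’s.

\[
\exists t\,\big(t < t_{\mathrm{now}} \wedge \exists x\,\mathrm{Spatial}(x,t)\big) \;\not\Rightarrow\; \exists x\,\mathrm{Spatial}(x,t_{\mathrm{now}})
\]

Even granting premises (1)-(3), what follows is at most that some spatial object existed and affected the senses at some past time, not that an extended spatial world exists now.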

e. Lessons of the Paralogisms

Beyond their specific arguments and conclusions, the Paralogisms present us with two central tenets of Kant’s conception of the mind. First, we cannot move from claims concerning the character or role of the first-person representation <I> to claims concerning the nature of the referent of that representation. This is a key part of his criticism of rational psychology. Second, people do not have privileged access to themselves as compared with things outside them. Both the self (or its states) and external objects are on a par with respect to intuition. This also means that people have access to themselves only as they appear, and not as they fundamentally, metaphysically, are (compare B157). Hence, according to Kant, self-awareness, just as much as awareness of anything distinct from the self, is conditioned by sensibility. Intellectual access to the self in apperception, Kant argues, does not reveal anything about one’s metaphysical nature, in the sense of the kind of thing that must exist to realize the various cognitive powers that Kant describes as characteristic of a being capable of apperception—a spontaneous understanding or intellect.

5. Summary

Kant’s conception of the mind, his distinction between sensory and intellectual faculties, his functionalism, his conception of mental content, and his work on the nature of the subject/object distinction, were all hugely influential. His work immediately inspired the German Idealist movement. He also became central to emerging ideas concerning the epistemology of science in the late 19th and early 20th centuries, in what became known as the “Neo-Kantian” movement in central and southern Germany. Though Anglophone interest in Kant ebbed somewhat in the early 20th century, his conception of the mind and criticisms of rationalist psychology were again influential mid-century via the work of “analytic” Kantians such as P.F. Strawson, Jonathan Bennett, and Wilfrid Sellars. In the early 21st century Kant’s work on the mind remains a touchstone for philosophical investigation, especially in the work of those influenced by Strawson or Sellars, such as Quassim Cassam, John McDowell, and Christopher Peacocke.

6. References and Further Reading

Quotations from Kant’s work are from the German edition of Kant’s works, the Akademie Ausgabe, with the first Critique cited by the standard A/B edition pagination and the other works by volume and page. English translations are the author’s own, though he has regularly consulted, and in most cases closely followed, the translations in the Cambridge Editions. Specific texts are abbreviated as follows:

  • An: Anthropology from a Pragmatic Point of View
  • C: Correspondence
  • CPR: Critique of Pure Reason
  • CJ: Critique of Judgment
  • JL: Jäsche Logic
  • LA: Lectures on Anthropology
  • LL: Lectures on Logic
  • LM: Lectures on Metaphysics
  • Pr: Prolegomena to any Future Metaphysics

a. Kant’s Works in English

The most used scholarly English translations of Kant’s work are published by Cambridge University Press as the Cambridge Editions of the Works of Immanuel Kant. The following are from that collection and contain some of Kant’s most important and influential writings.

  • Correspondence, ed. Arnulf Zweig. Cambridge: Cambridge University Press, 1999.
  • Critique of Pure Reason, trans. Paul Guyer and Allen Wood. Cambridge: Cambridge University Press, 1998.
  • Critique of the Power of Judgment, trans. Paul Guyer and Eric Matthews. Cambridge: Cambridge University Press, 2000.
  • History, Anthropology, and Education, eds. Günter Zöller and Robert Louden. Cambridge: Cambridge University Press, 2007.
  • Lectures on Anthropology, ed. and trans. Allen W. Wood and Robert B. Louden. Cambridge: Cambridge University Press, 2012.
  • Lectures on Logic, trans. J. Michael Young. Cambridge: Cambridge University Press, 1992.
  • Lectures on Metaphysics, ed. and trans. Karl Ameriks and Steve Naragon. Cambridge: Cambridge University Press, 2001.
  • Practical Philosophy, ed. Mary Gregor. Cambridge: Cambridge University Press, 1996.
  • Theoretical Philosophy 1755-1770, ed. David Walford. Cambridge: Cambridge University Press, 2002.
  • Theoretical Philosophy after 1781, eds. Henry Allison and Peter Heath. Cambridge: Cambridge University Press, 2002.

b. Secondary Sources

  • Allais, Lucy. 2009. “Kant, Non-Conceptual Content and the Representation of Space.” Journal of the History of Philosophy 47 (3): 383–413.
  • Allison, Henry E. 2004. Kant’s Transcendental Idealism: Revised and Enlarged. New Haven: Yale University Press.
  • Ameriks, Karl. 2000. Kant and the Fate of Autonomy: Problems in the Appropriation of the Critical Philosophy. Cambridge: Cambridge University Press.
  • Anderson, R. Lanier. 2005. “Neo-Kantianism and the Roots of Anti-Psychologism.” British Journal for the History of Philosophy 13 (2): 287–323.
  • Andrews, Kristin. 2014. The Animal Mind: An Introduction to the Philosophy of Animal Cognition. London: Routledge.
  • Bennett, Jonathan. 1966. Kant’s Analytic. Cambridge: Cambridge University Press.
  • Bennett, Jonathan. 1974. Kant’s Dialectic. Cambridge: Cambridge University Press.
  • Bermúdez, José Luis. 2003. “Ascribing Thoughts to Non-Linguistic Creatures.” Facta Philosophica 5 (2): 313–34.
  • Brook, Andrew. 1997. Kant and the Mind. Cambridge: Cambridge University Press.
  • Buroker, Jill Vance. 2006. Kant’s Critique of Pure Reason: An Introduction. Cambridge: Cambridge University Press.
  • Carl, Wolfgang. 1989. “Kant’s First Drafts of the Deduction of the Categories.” In Kant’s Transcendental Deductions, edited by Eckart Förster, 3–20. Stanford: Stanford University Press.
  • Carson, Emily. 1997. “Kant on Intuition and Geometry.” Canadian Journal of Philosophy 27 (4): 489–512.
  • Carson, Emily. 1999. “Kant on the Method of Mathematics.” Journal of the History of Philosophy 37 (4): 629–52.
  • Caygill, Howard. 1995. A Kant Dictionary. Vol. 121. London: Blackwell.
  • Chignell, Andrew. 2014. “Modal Motivations for Noumenal Ignorance: Knowledge, Cognition, and Coherence.” Kant-Studien 105 (4): 573–97.
  • Descartes, Rene. 1984. The Philosophical Writings of Descartes. Edited by John Cottingham, Robert Stoothoff, and Dugald Murdoch. Vol. 2. Cambridge: Cambridge University Press.
  • Dicker, Georges. 2004. Kant’s Theory of Knowledge: An Analytical Introduction. Oxford: Oxford University Press.
  • Dyck, Corey W. 2010. “The Aeneas Argument: Personality and Immortality in Kant’s Third Paralogism.” In Kant Yearbook, edited by Dietmar Heidemann, 95–122.
  • Engstrom, Stephen. 2013. “Unity of Apperception.” Studi Kantiani 26: 37–54.
  • Friedman, Michael. 1992. Kant and the Exact Sciences. Cambridge, MA: Harvard University Press.
  • Gardner, Sebastian. 1999. Kant and the Critique of Pure Reason. London: Routledge.
  • Ginsborg, Hannah. 2006. “Kant and the Problem of Experience.” Philosophical Topics 34 (1 and 2): 59–106.
  • Griffith, Aaron M. 2012. “Perception and the Categories: A Conceptualist Reading of Kant’s Critique of Pure Reason.” European Journal of Philosophy 20 (2): 193–222.
  • Grüne, Stefanie. 2009. Blinde Anschauung. Vittorio Klostermann.
  • Guyer, Paul. 1987. Kant and the Claims of Knowledge. Cambridge: Cambridge University Press.
  • Guyer, Paul. 2014. Kant. London: Routledge.
  • Hanna, Robert. 2002. “Mathematics for Humans: Kant’s Philosophy of Arithmetic Revisited.” European Journal of Philosophy 10 (3): 328–52.
  • Hanna, Robert. 2005. “Kant and Nonconceptual Content.” European Journal of Philosophy 13 (2): 247–90.
  • Heck, Richard G. 2000. “Nonconceptual Content and the ‘Space of Reasons’.” The Philosophical Review 109 (4): 483–523.
  • Hume, David. 1888. A Treatise of Human Nature. Edited by L A Selby-Bigge. Oxford: Clarendon Press.
  • Hume, David. 2007. An Enquiry Concerning Human Understanding. Edited by Peter Millican. Oxford: Oxford University Press.
  • James, William. 1890. The Principles of Psychology. New York: Holt.
  • Keller, Pierre. 1998. Kant and the Demands of Self-Consciousness. Cambridge: Cambridge University Press.
  • Kitcher, Patricia. 1993. Kant’s Transcendental Psychology. New York: Oxford University Press.
  • Kitcher, Patricia. 2010. Kant’s Thinker. New York: Oxford University Press.
  • Leibniz, Gottfried Wilhelm Freiherr. 1996. New Essays on Human Understanding. Edited by Jonathan Bennett and Peter Remnant. Cambridge: Cambridge University Press.
  • Longuenesse, Béatrice. 1998. Kant and the Capacity to Judge. Princeton: Princeton University Press.
  • Lurz, Robert W. 2011. Mindreading Animals: The Debate over What Animals Know About Other Minds. Cambridge, MA: MIT Press.
  • Lurz, Robert W., ed. 2009. The Philosophy of Animal Minds. Cambridge: Cambridge University Press.
  • McDowell, John. 1996. Mind and World: With a New Introduction. Cambridge, MA: Harvard University Press.
  • McLear, Colin. 2011. “Kant on Animal Consciousness.” Philosophers’ Imprint 11 (15): 1–16.
  • McLear, Colin. 2015. “Two Kinds of Unity in the Critique of Pure Reason.” Journal of the History of Philosophy 53 (1): 79–110.
  • Meerbote, Ralf. 1991. “Kant’s Functionalism.” In Historical Foundations of Cognitive Science, edited by J-C Smith, 161–87. Dordrecht: Kluwer Academic Publishers.
  • Naragon, Steve. 1990. “Kant on Descartes and the Brutes.” Kant-Studien 81 (1): 1–23.
  • Parsons, Charles. 1964. “Infinity and Kant’s Conception of the ‘Possibility of Experience’.” The Philosophical Review 73 (2): 182–97.
  • Parsons, Charles. 1992. “The Transcendental Aesthetic.” In The Cambridge Companion to Kant, edited by Paul Guyer, 62–100. Cambridge: Cambridge University Press.
  • Paton, H. J. 1936. Kant’s Metaphysic of Experience. Vol. 1 & 2. London: G. Allen & Unwin, Ltd.
  • Pendlebury, Michael. 1995. “Making Sense of Kant’s Schematism.” Philosophy and Phenomenological Research 55 (4): 777–97.
  • Pereboom, Derk. 1995. “Determinism Al Dente.” Noûs 29: 21–45.
  • Pereboom, Derk. 2006. “Kant’s Metaphysical and Transcendental Deductions.” In A Companion to Kant, edited by Graham Bird, 154–68. Blackwell Publishing.
  • Pereboom, Derk. 2009. “Kant’s Transcendental Arguments.” Stanford Encyclopedia of Philosophy.
  • Proops, Ian. 2010. “Kant’s First Paralogism.” The Philosophical Review 119 (4): 449.
  • Schellenberg, S. 2011. “Perceptual Content Defended.” Noûs 45 (4): 714–50.
  • Sellars, Wilfrid. 1956. “Empiricism and the Philosophy of Mind.” Minnesota Studies in the Philosophy of Science 1: 253–329.
  • Sellars, Wilfrid. 1968. Science and Metaphysics: Variations on Kantian Themes. London: Routledge & Kegan Paul.
  • Sellars, Wilfrid. 1978. “Berkeley and Descartes: Reflections on the Theory of Ideas.” In Studies in Perception, edited by P K Machamer and R G Turnbull, 259–311. Columbus: Ohio University Press.
  • Shabel, Lisa. 2006. “Kant’s Philosophy of Mathematics.” In The Cambridge Companion to Kant and the Critique of Pure Reason, edited by Paul Guyer, 94–128.
  • Siegel, Susanna. 2010. “The Contents of Perception.” In The Stanford Encyclopedia of Philosophy, edited by Edward N Zalta.
  • Siegel, Susanna. 2011. The Contents of Visual Experience. Oxford: Oxford University Press.
  • Smit, Houston. 2000. “Kant on Marks and the Immediacy of Intuition.” The Philosophical Review 109 (2): 235–66.
  • Strawson, Peter Frederick. 1966. The Bounds of Sense. London: Routledge.
  • Strawson, Peter Frederick. 1970. “Imagination and Perception.” In Experience and Theory, edited by Lawrence Foster and Joe William Swanson. Amherst: University of Massachusetts Press.
  • Sutherland, Daniel. 2008. “Arithmetic from Kant to Frege: Numbers, Pure Units, and the Limits of Conceptual Representation.” In Kant and Philosophy of Science Today, edited by Michela Massimi, 135–64. Cambridge: Cambridge University Press.
  • Tolley, Clinton. 2013. “The Non-Conceptuality of the Content of Intuitions: A New Approach.” Kantian Review 18 (1): 107–36.
  • Tolley, Clinton. 2014. “Kant on the Content of Cognition.” European Journal of Philosophy 22 (2): 200–228.
  • Van Cleve, James. 1999. Problems from Kant. Oxford: Oxford University Press.
  • Wood, Allen W. 2005. Kant. Oxford: Blackwell Publishing.

 

Author Information

Colin McLear
Email: mclear@unl.edu
University of Nebraska
U. S. A.

Lucius Annaeus Seneca (c. 4 B.C.E.—65 C.E.)

The ancient Roman philosopher Seneca was a Stoic who adopted and argued largely from within the framework he inherited from his Stoic predecessors. His Letters to Lucilius have long been among the most widely read Stoic texts. Seneca’s texts have many aims: he writes to exhort readers to philosophy, to encourage them to continue their study, to articulate his philosophical position, to defend Stoicism against opponents, to portray a philosophical life, and much more. Seneca also writes to criticize the social practices and values of his fellow Romans. He rejects and criticizes, among other things, the ideas that death is an evil, that wealth is a good, that political power is valuable, and that anger is justified. In Seneca’s philosophical texts, one finds a Stoic who attempts to live in accordance with the conclusions he reaches through philosophy. Though Seneca admits to falling short of this goal personally, his efforts have long been among the attractions (though some have found them to be distractions) of his philosophical works.

Lucius Annaeus Seneca was born in Cordoba during the reign of Augustus. Because of his birth to a provincial nobleman of low rank, Seneca was quite removed from the workings of the powerful Roman elite, yet the course of his life would come to be shaped by his relationships—sometimes inimical, sometimes friendly—with the early Julio-Claudian Emperors. He was exiled by Claudius and then recalled. He was friend and tutor to Nero. This relationship itself eventually soured, and Seneca, under orders from Nero, committed suicide in 65 C.E.

Someone familiar with Seneca exclusively as a philosopher is likely to be shocked by the details of his personal life. How, one may wonder, should Seneca’s argument that poverty is not an evil be understood in light of the fact that Seneca was one of the wealthiest men in the world? And how should Seneca’s commitment to and claims about the value of living philosophically be understood in light of the fact that Seneca’s own life was riddled with controversy and intrigue? On the other hand, one familiar with Seneca’s life may well meet with wonder the philosophical positions to be found in his philosophical works. How, one may ask, could the person who had positioned himself as the advisor to the young and impressionable (ex hypothesi) Princeps of Rome be the same person who upholds the private life as superior to the public? How could a man whose life story seems impossible for any but the most flexible character be the author of texts upholding the value of integrity and self-mastery as against mastery by one’s circumstances? These and many other questions make a clear view of Seneca difficult. This article attempts to provide a general sense of Seneca’s life and works that can serve as a starting point for understanding Seneca’s legacy. The aim here is primarily to bring the difficulties into view, rather than to resolve them.

Table of Contents

  1. Life, Political Career, and Death
  2. Works and Thought
    1. Seneca and Stoicism
    2. Philosophical Substance and Literary Talent
    3. The Letters to Lucilius
    4. Anger, Grief, and the Therapy of Emotions
    5. Natural Philosophy
    6. Non-philosophical Works
    7. Criticism and Influence
  3. References and Further Reading
    1. Texts and Translations
    2. Secondary Literature

1. Life, Political Career, and Death

Although the general outline of Seneca’s life is known, that many details remain unknown is surprising given both Seneca’s fame during his lifetime and the volume of his writing. On many points of detail about his life, scholars must take into consideration the available sources, some of which date from centuries after Seneca’s death and others of which are hostile to his writings, and reconstruct a plausible account. Seneca’s birth is one of many such examples. Seneca was born in Cordoba, Spain. His father, Seneca the Elder, was a member of the Roman nobility whose family had immigrated to Spain. Seneca spent his earliest years with his mother Helvia at the family estates in Cordoba while his father was away in Rome. We do not know with certainty the year of Seneca’s birth, but the evidence from his scant references to his own life suggests that he was born no earlier than 8 B.C.E. and no later than 1 B.C.E. Though some uncertainty is inescapable unless new evidence is discovered, the most common estimate for his birth is 4 B.C.E.

Seneca’s father, also Lucius Annaeus Seneca (the Elder), was a Roman nobleman of the equestrian class. The Elder’s enthusiasm for Roman politics, and for his two older sons’ potential in Roman society, is plain in his Controversiae. Also plain is his insistence that the path for his middle son, our Seneca, was to be the normal cursus honorum (course of offices) and not the life of philosophical study. Seneca the Younger thus came to Rome very early, likely by age 5, to begin his training for Roman public life. Seneca’s early education is likely to have been typical of Roman elites at the time—focusing on language (both Greek and Latin) and traditional texts. Though his father would have been eligible for certain Roman offices, he seems instead to have devoted himself to forwarding the careers of his two oldest sons, Annaeus Novatus (later named Gallio upon adoption by L. Junius Gallio) and our Seneca. The Elder Seneca did not pressure his youngest son, Marcus Annaeus Mela, eventual father of Lucan, to pursue a political career.

Little is known with certainty about Seneca’s early life, particularly his personal life. Seneca presents himself in his philosophical works in a way that conceals personal details; however, the details he does give can provide helpful insight. His references, for example, to his former teachers—Attalus the Stoic, Fabianus the Sextian, and others—give some indication of his advanced training in philosophy and rhetoric. Scholars have found these references to his training, though sparse, crucial for understanding Seneca’s particular philosophical approach. Seneca does not, however, say enough about his personal experiences in Rome to help scholars develop a robust biography. Further complicating matters is the fact that while Seneca is mentioned in histories from the ancient world, including those of Tacitus and Cassius Dio and the biographies of Suetonius, his life as a whole is nowhere a topic of sustained focus.

We know that Seneca’s political career had a slow beginning. By the time Gaius (Caligula) Caesar died in 41 C.E., Seneca (now roughly 45 years old) had not yet advanced to the rank of Praetor, a rank for which he would have been eligible many years earlier. Seneca’s delayed progress in, or delayed entrance into, the cursus honorum has been a matter of much research and speculation and has been explained by one or more of the following: Seneca’s recurring bouts of poor health, because of which he is thought to have spent a number of years in Egypt; his increasing interest in a philosophical, rather than public, life; his emerging reputation as a rhetorical talent; and the tumultuous political environment from Sejanus’ rise and fall until the accession of Claudius in 41. Whatever the explanation, and whatever Seneca’s political ambitions may have been, they were stalled when, in 41, he was exiled by Claudius to the island of Corsica, where he would remain until 49.

Although Seneca’s guilt is not clearly attested in our sources, he was charged before the Senate with committing adultery with Julia Livilla, the sister of Gaius Caesar, and convicted. Seneca tells us in the Consolation to Polybius (13.2) that he had been convicted and sentenced to death by the Senate but that Claudius had spared his life. Claudius’ intervention, along with some other uncertainties about the case, perhaps suggests that the case against Seneca was, despite the Senate’s ruling, not decisive. The historian Cassius Dio (60.8.4; see also Griffin, 32) argues that Seneca was essentially a casualty of an attempt by Messalina, Claudius’ wife, to be rid of Julia Livilla. On the other hand, Seneca was clearly a friend of Julia’s family. Her sister, Agrippina the Younger, would later be instrumental in reviving Seneca’s political career. Whatever the case, the occasion of Seneca’s exile marks the beginning of his involvement with the imperial family, which would guide the course of his life thereafter.

Seneca’s exile ended with the help of Agrippina the Younger, now wife of Claudius, in 49 C.E. Upon Seneca’s return to Rome, he became the tutor of Agrippina’s son, the young Nero. Seneca’s role in Roman politics after his recall in 49 was largely unconventional. He was at first known as the ‘tutor’ (magister) of Nero and later became (along with Burrus) an influential advisor and speech-writer. In our records he is variously referred to as Nero’s ‘friend’ (amicus) and tutor. Neither of these titles had historically been associated with much political power, but it seems that Seneca likely played an important role in governing Rome, at least in the early years of Nero’s rule. It is difficult to know just which actions were taken on Seneca’s advice and which were not, though some ancient sources credit Seneca with the good policies and blame Burrus for the bad ones. Whatever the details of Seneca’s contribution, the first five years of Nero’s reign—the ‘quinquennium Neronis‘—have been noted for their successes. Here again, though, historians are divided on whether the successes of the first five years of Nero’s reign were genuine or merely successes in public relations, for which Seneca would have been well suited. As Nero matured, though, he began to rely less and less on Seneca’s advice. Eventually, Seneca was named as an associate in the failed Pisonian Conspiracy to overthrow Nero. In 65 C.E., Seneca was sentenced by Nero to commit suicide.

The circumstances of Seneca’s death are reported at length in Tacitus’ Annals (XV.60 ff.) and in less detail by both Cassius Dio and Suetonius. Indeed, Seneca’s death has been a topic of great intrigue and disagreement. Upon receiving word of his sentence, Seneca is reported to have acted calmly. He cut his wrists and legs to let his blood drain, but this proved ineffective because of his frail condition. He then took hemlock, which was also ineffective because of his poor circulation. He was then placed in a bath to improve his circulation and finally suffocated from the steam. As he had specified in his will, he was cremated without ceremony.

The setting and circumstances of Seneca’s death serve as a window into the difficulties of understanding the relation between his life and philosophical work. On the one hand, his death seems to be modeled on that of Socrates in Plato’s Phaedo. His last moments are tranquil. He is described as being calm upon receiving the judgment of Nero and then meeting his death, which was, it seems, preceded by dinner and conversation with his wife, Paulina, and friends. During the ordeal itself, he attempts to calm his friends by telling them to follow the “imago” (“pattern” or “image”) of his life. Seneca here likely means the image of a philosophical life that he has crafted in his works. But that picture of his life does not always fit comfortably with the rest of what we learn from our sources. Tacitus’ account of his death illustrates this. For while Seneca’s demeanor and actions remind us of Socrates’ death, the life that precedes this end bears little similarity to Socrates’. Seneca seems to have crafted a philosophical death, but in a context of great political intrigue. Whereas Socrates dies, at least partly, for his refusal to become involved in Athenian political affairs, Seneca dies, also at least partly, for the failure of his political maneuvers. Seneca seems to have known the sentence of death was coming. He may well have been involved, as alleged, in the Pisonian conspiracy. After his account of Seneca’s death, Tacitus reports a rumor that after the assassination of Nero, Piso was also to be put to death, and Seneca installed as princeps. Tacitus reports that Seneca is rumored to have known of this plan.

2. Works and Thought

Despite Seneca’s turbulent political career, he managed to produce and publish a great deal. His most famous and widely read works are his Letters to Lucilius. The Letters contain much that is of interest to students of Stoicism in general and have served for many as an entry point into Stoic philosophy. The Letters also show something of how Seneca thought philosophical principles could shape how one lives. In addition to the Letters, many other philosophical works—collected under the title ‘Dialogi’—survive. These treatises, some of which are incomplete, include three Consolations (Consolation to Marcia, Consolation to Helvia, Consolation to Polybius) and philosophical treatises on specific questions, topics, or themes (On Anger, On Mercy, On Leisure, On the Constancy of the Wise Person, On Providence, On Benefits). Seneca’s extended work, the Natural Questions, investigates various meteorological phenomena from the point of view of Stoic natural philosophy. In addition to his philosophical works, eight of Seneca’s tragedies survive along with a work that satirizes the deification of Claudius (The Apocolocyntosis or ‘Pumpkinification’ of Claudius). It is known that Seneca wrote many other works that have been lost, including the public speeches that he wrote for Nero.

a. Seneca and Stoicism

Seneca’s philosophical outlook is best understood in terms of his particular circumstances. He, like many Roman philosophers of his time, was more interested in moral philosophy than in the other two branches of philosophy (dialectic, or logic, and physics) that had become standard in Hellenistic thinking about the parts of philosophy. While Seneca is clearly well-trained and widely read in all parts of philosophy, he chooses to focus on moral philosophy in his texts. With the exception of the Natural Questions, which is devoted entirely to the branch of philosophy called ‘physics’ (a branch that included natural philosophy as well as theology), much of Seneca’s work focuses on ethical matters. Also like other philosophers of his time, Seneca gives his moral philosophy a clear practical emphasis. While discussions of theory and theoretical controversies abound in Seneca’s Letters and other works, his focus is consistently on how his theory—Stoicism—can be brought to bear on living one’s life. Seneca emphasizes the importance of this in Letter 89, where he encourages Lucilius (the addressee of the Letters) to indulge his wish to study logic so long as he refers everything that he learns to living a good life.

Seneca clearly sees himself as a Stoic. He commonly refers to the Stoic school as ‘ours’ and does much to defend the Stoics against certain Peripatetic and Epicurean attacks. Still, he is willing to disagree with the Stoics about certain matters in which he thinks a clearer or better argument is available. In Letter 33, for example, Seneca claims that he follows the teachings of the Stoics, but points out that the people who have discovered important truths in the past are not his masters (domini), but rather his guides (duces). Elsewhere, in his On Leisure, Seneca makes a similar point that he accepts the views of Zeno and Chrysippus (two early leaders of the Stoa) not just because Zeno or Chrysippus taught them, but because the arguments themselves lead to those positions.

He is also willing to make some concessions to the Stoics’ main adversary—the Epicurean. Seneca’s stance, especially toward Epicurus, has led some readers to think that Seneca is best described as ‘eclectic’ rather than Stoic. His willingness to draw upon the philosophy of Epicurus, Plato, and others has seemed to some to betray the softness of his commitment to Stoicism. Seneca’s reply to this charge can be found in the passages from Letter 33 and On Leisure above. His focus is on the truth. He believes that, in some cases, the Epicurean or the Aristotelian has hit upon the truth. He is happy to acknowledge this to Lucilius and his readers but is nonetheless ready to point out that they have arrived at the truth for the wrong reasons. His treatise On Leisure illustrates this point. The question is whether the wise person ought to engage in public life or instead retire to the pursuits of private life, which include philosophical study. The Epicurean view is that the wise person will not engage in public life unless something interferes. The Stoic view is that the wise person will engage in public life unless something interferes. Seneca, though, argues that the importance of the projects of one’s private life (including the study of philosophy) can, in fact, trump the requirement to enter public life, even according to the Stoic view. This, he argues, shows that the pursuit of philosophical study and avoidance of public life are, in fact, recommended by the Stoics. The Epicureans’ overt call to avoid public life is mistaken, Seneca argues, because it assumes that a life devoted to politics cannot be harmonious with the philosophical life. Seneca concedes that in the actual world, as it is now, that is true, but points out that circumstances can change. In a world where public service would produce greater benefit to mankind than private, philosophical work, a wise person would engage in the former.

Certain affinities between Seneca and his most famous fellow Roman philosophers—Marcus Aurelius and Epictetus—are commonly noted. All are concerned with the importance of living a philosophical life. All are, in the works that survive, more concerned with ethics than with other branches of philosophy. These generalizations are accurate, but they obscure some features of Seneca’s philosophical works that distinguish him from these Roman Stoics. In particular, Seneca’s philosophical works were written for publication. In contrast, Epictetus did not write anything, and Marcus wrote for himself; Seneca, though, intended that his works be read, and they were read widely during and after his lifetime.

A related and in some ways more significant feature of Seneca’s authorship is his decision not only to write for an audience, but to do so in Latin rather than Greek. In the generations both before and after Seneca, Greek remained the language of philosophical discourse. Two notable exceptions to this pattern are the Epicurean Lucretius’ epic poem De Rerum Natura (On the Nature of Things) and the philosophical works of Marcus Tullius Cicero. The efforts of Lucretius and Cicero to bring philosophy to Latin and to prove that Latin is sufficient for the task (a regular theme in Cicero’s works) largely failed. Seneca, however, does not seem to have had a goal of bringing philosophy to Latin. Unlike Cicero, he has little interest in demonstrating that Latin could accommodate the Greek technical vocabulary. This has made Seneca’s texts of little use for those seeking to trace the history of particular terms or concepts through Classical and Hellenistic philosophy. On the other hand, Seneca’s approach makes it clear that he is not concerned with matters of concordance or with establishing or maintaining a particular paradigm of philosophical exposition. Seneca is, instead, doing philosophy in Latin (Inwood, 2005).

Though Seneca distinguishes himself from his peers in some respects, he nonetheless professes his allegiance to Stoicism. His commitment to the school can be seen most clearly in his frequent return to a number of core Stoic positions—particularly the positions defended in Stoic moral philosophy. The Stoic view of morality is distinguished from those of other Hellenistic and Classical philosophical schools by its commitment to the idea that an individual has absolute authority over her happiness. The Stoics reject the Aristotelian idea that one’s happiness (eudaimonia) is at least in part determined by things outside one’s control. Seneca stands with the Stoics in rejecting this view of happiness. He frequently returns to this theme in different contexts and emphasizes the importance of knowing what things are in one’s power and what things are not. Seneca agrees with the Stoics that virtue is sufficient for happiness. One’s virtue, unlike one’s circumstances, is within one’s power.

Knowledge of one’s nature is importantly connected, in Stoicism, with one’s knowledge of nature generally. Seneca often appeals to the importance of understanding nature in his works. He recommends, for example, that one who is setting off on a voyage say to himself that he will arrive at his destination unless something interferes. This statement is taken to reflect the understanding that whether one’s actions unfold as one wishes is not entirely within one’s control. Thus, Seneca urges that it would be a mistake to say “I will arrive at my destination.” Such a plan ignores the fact that many ships do not reach their destinations. The more one understands the nature of things, the more one understands what is in one’s power and what is not.

Indeed, the Stoics emphasize that to live well one must live according to nature. In Seneca’s texts, this emphasis provides the background for criticism of his culture and fellow Romans. To follow nature or live according to nature requires that one abandon many practices and values that have been taken up through acculturation. Seneca’s return throughout his philosophical writings to the dangers of public life, of crowds, and of social excesses relies on this point—that much of society is corrupt. To live as the mob supposes one should live is to stray from nature. Seneca notes, in Letter 46, that reason demands one live in accordance with one’s own nature, but this nature can be led astray.

b. Philosophical Substance and Literary Talent

Seneca’s literary talent was unmatched during his lifetime. His style appealed immediately to his Roman audience. Writing a generation after Seneca, Quintilian notes in his Institutiones that, early in Quintilian’s own career, Seneca’s works were the only works being read. Quintilian’s treatment of Seneca’s texts is telling. In cataloguing the texts of other authors, he systematically omits Seneca’s contributions to each genre. Seneca’s works are instead given their own treatment because of the difficulty of reading them judiciously. Quintilian praises Seneca’s works but recommends that advanced training be completed prior to reading them.

With some modifications, this advice has been upheld by modern readers of Seneca. While he is often rated a philosophical amateur, no scholar would venture a similar claim about his literary talents. This recognition has led scholars of Seneca’s philosophical positions to take more care to understand the literary aims and constraints of his work. By all accounts, even from as early as Tacitus and Quintilian, Seneca’s prose style was both original and quite popular. His originality extends beyond the style of his sentences all the way to the organization of his philosophical treatises. He everywhere prefers a style of philosophical writing that more closely resembles conversation.

Seneca’s literary genius confronts readers of his texts with a difficulty. Those interested in Seneca’s philosophy cannot simply ignore aspects of genre, style, and so on. For Seneca, these are importantly connected. Often the philosophical message of a treatise or letter is entangled with the norms of the genre in which he is working. At the same time, Seneca often presses against such norms to enlarge or bring into focus certain philosophical points. He claims, for example, that philosophical discourse can be appropriately undertaken as a conversation (Letter 75.1-2). To a great extent, Seneca’s philosophical texts reflect this preference: straightforward exposition is rare in his works. More frequently, his addressee is made to interrupt a point by asking a question or posing a challenge. In some cases, though, the demands of philosophical exposition require setting aside the genre’s norms. In Letter 95, for example, Seneca blames Lucilius for the letter’s length and technical detail. This interplay between style and substance requires great care in interpreting Seneca’s philosophical achievements.

Seneca’s literary talents further complicate interpreting his philosophical works when one considers his controversial career. In some cases a careful interpretation of his work cannot ignore the immediate political context. The Apocolocyntosis, a scathing attack on Claudius, has clear political and public aims (though little of philosophical interest). His Consolation to Helvia, written to his mother during his exile, may well have been intended as a defense and request for recall. Similarly, he once mentions (Polybius 13.2) his trial and conviction, perhaps in an effort to remind Claudius of his innocence. These references to his own life, though rare, alert readers to the fact that his treatises may be constructed with many goals in mind: philosophical, but also personal, political, and literary. One can, for example, see the intermingling of aims in the opening passages of On Mercy, where Seneca praises Nero’s virtues. The praise of Nero’s character has both a philosophical and political goal: to encourage careful thinking about the importance, for a ruler, of cultivating mercy and to exhort the ruler of Rome to have mercy on those who may be thought to have wronged him.

c. The Letters to Lucilius

The Letters to Lucilius are Seneca’s most widely read and influential texts. The Letters contain much that is of interest to philosophers and to non-philosophers alike. One hundred and twenty-four letters survive, divided into twenty books. It is likely that not all of the Letters have been preserved. The interpretation of Seneca’s Letters has been a matter of much disagreement among scholars.

The Letters themselves contain a wide variety of material ranging from apparently mundane discussions (for example, the dangers of crowds and public baths) to advanced technical discussions of Stoic theory. Seneca often makes use of something in everyday life to steer discussion to an ethical question or to some piece of moral advice. An over-arching interpretation of the Letters as a literary and philosophical work has eluded scholarly consensus. Still, a number of features of the Letters stand out as helpful for their interpretation. First, many groups of letters deal with common themes. Letters 5-10, for example, deal broadly with questions about living a philosophical life. Letters 94-5, the longest two letters of the work, deal with a technical question about the role of rules in moral reasoning. These are but two examples. There are few, if any, Letters whose themes do not find echoes in others. Second, there is a noted trend, as the letters progress, toward longer, more technical, and more substantive philosophical discussions. This feature suggests that the Letters, aside from the apparently disparate themes and discussions along the way, also aim to demonstrate a philosophical education.

This aim is apparent early in the Letters. Seneca urges Lucilius in the first letter against the fault of wasting his time carelessly. In the second letter, he advises Lucilius on the correct approach to reading philosophical texts. In the fifth letter, he applauds Lucilius for persistence in his philosophical study but warns him to remain focused on the goal of philosophical study—that is, moral improvement—rather than the goal of many to simply make a show of philosophical talent. Seneca’s advice about philosophy—both how and what to study and how to apply it to one’s life—continues throughout the Letters. Scholars have long noted the apparent improvement of Lucilius as the Letters progress as evidence that Seneca means not simply to discuss philosophical progress but also to illustrate what it is like. The Lucilius of the early letters is not very sophisticated: the reader is made to suppose he is in the habit of requesting from Seneca pithy philosophical maxims to memorize. In Letter 33, Seneca chastises him for this and discontinues the practice of ending his letters with maxims. Later, in Letter 82, Seneca reports that he is happy with Lucilius’ progress. The later Letters also show Lucilius asking, apparently, more and more technical and difficult philosophical questions. Indeed, the later letters are, on the whole, considerably more philosophically rich than the early ones.

While Lucilius’ progress is arguably a theme that unites the letters, it is a theme that allows the philosophical discussions included in them to vary considerably. No one argument or position is systematically defended or articulated throughout the Letters as a whole. Instead, philosophical discussions are more localized, sometimes occupying the space of one letter, other times spanning a group of three or four. Sometimes a question addressed in one letter is picked up again much later. One can find in Seneca’s Letters various discussions of, to name a few, friendship, death, fate, poverty, moral theory, virtue, the good, argument, and much else. In all of his discussions, Seneca emphasizes the importance of being critical both of oneself and one’s way of living and of the received views, both popular and philosophical.

A brief account of the work’s first letter, though scarcely sufficient as a general introduction to the Letters, gives some indication of Seneca’s approach. The letter begins with some advice to Lucilius. He is to continue his efforts in devoting time to philosophical study. The theme of the Letter is just this—that too much time is wasted on worldly pursuits. Time flies, and as we delay what matters, life runs past. This theme is common in Latin literature: famous phrases like “tempus fugit” (from Vergil) and “carpe diem” (Horace) illustrate this. Seneca’s discussion of this offers no new philosophical insight. Still, as the letter continues, the philosophical point comes into view. The advice about wasting time generalizes to one’s life as a whole. To let one’s time slip away is to let oneself be occupied with things that are not really important. Seneca confesses that though he, too, wastes time, he has come to recognize when he is doing so. He counts this as progress and advises that Lucilius do what he can to keep what is really his.

As is typical of the Letters, this letter has Stoicism in view but does not heavy-handedly address or engage in Stoic theory. As a Stoic, Seneca is committed to the view that much of what one does in life is of little value. One’s day-to-day business contributes nothing to living a good life unless it is accompanied by reflection on the manner of one’s life. Seneca’s proposal that one should waste little, and be aware of what one is wasting, points to the Stoic view. What matters is acting virtuously, and this requires reflection on one’s actions. This is the first step to living well.

d. Anger, Grief, and the Therapy of Emotions

A defining principle of Stoicism is the claim that the mind is wholly rational, in contrast to the Platonists and Aristotelians, who posited a mind composed of both rational and non-rational parts. According to the Platonic/Aristotelian account of human psychology, emotions such as anger and fear can be explained by appeal to the non-rational parts of the mind. On the Stoic view of the mind, no such appeal can be made: Stoic theory posits no non-rational aspects of the mind. The whole, unitary, mind is implicated in its actions. This feature of the Stoic theory has important implications both for its account of the emotions and for its evaluation of them.

The Stoics view emotions as irrational movements of the mind. Since there are no non-rational parts of the mind, the Stoics understand a movement to be ‘irrational’ when it is contrary to right reason. Anger is a state in which one is not guided by correct reasoning; so is fear; and so on. Hence, emotions are states of mind that are contrary to right reason. One who is not angry would think and act differently than one who is, and, at least in the case of the perfect moral agent, the actions of someone who is not angry would be fully guided by correct reasoning. The Stoics explain that the emotions arise when one assents to certain kinds of false statements about the world. Consider the following judgments one may make in response to having one’s car stolen:

S1: My car has been stolen.

S2: It is bad to have one’s car stolen.

S3: It is appropriate to respond to having one’s car stolen in an emotional way.

In an ordinary case, the Stoics claim, one’s episode of anger can be explained by appeal to these three propositions. One first encounters some state of affairs, articulates it, and assents to it—S1. One often goes on to form a secondary articulation, along the lines of S2, about the goodness or badness of this state of affairs. If one assents to this statement, one often continues to react in a way that somehow corresponds to the judgment reflected in S2. ‘S3’ is not exactly what one assents to. Instead, S3 is meant to capture something about the angry person’s response. Consider, for example, that an angry person might well scream “in anger” or do some violence to his surroundings or the like. The analysis of anger is meant to capture (via S3) this feature of anger (and other emotions).

According to Stoic theory, judgments of the form S2 and S3 are nearly always false. The Stoics hold that the only good is virtue and that the only evil is vice. All else is indifferent. According to this theory of value, having one’s car stolen is not bad; thus S2 is false. Similarly, since nothing bad has happened, the course of action sanctioned by S2 and S3 is illegitimate. No emotional response is appropriate.

Seneca devotes much of his philosophical work to advancing these aspects of Stoicism. The chief concern behind the Stoic theory of emotions and the theory of value is that until one removes such false beliefs about value, one will not succeed in living a happy life. It is with this that Seneca concerns himself in his philosophical work. He aims, for example, in On Anger to help his readers avoid becoming angry, and offers what little advice there is to help those who are angry stop being so. In the Consolations, he is concerned with helping his readers avoid the life-shattering effects of grief. Elsewhere, Seneca works to help people let go of their fear of death.

In his Consolations in particular, as well as in his treatise On Anger and other works, Seneca is clearly more often concerned with helping people avoid experiencing emotions. As a Stoic, he is committed to the idea that emotional experiences involve false judgments. Still, Seneca does not typically concern himself with explicating the theory itself. While our reports from Greek doxographers and from Cicero preserve the outlines of the theory, Seneca feels no need to repeat it. One noteworthy exception to this is Seneca’s On Anger. Here (in Book II.1.4) Seneca explains the structure of an emotional experience. His explanation attempts to show that anger is voluntary despite the fact that one cannot entirely control the way things appear.

Seneca’s strategy is to explain anger in terms of three ‘movements’. The first movement, he says, is involuntary. It is the moment when the mind articulates some state of affairs—that ‘having my car stolen is a bad thing’. This may correspond, in some cases, to an elevated heart rate, a sinking feeling in one’s stomach, or the like. This initial experience is, Seneca claims, beyond one’s immediate control, but it is not anger. To be angry, one must “assent” to the proposition. That is, one must sanction the assertion that “such and such is a bad thing.” Once the assent is given, one is angry.

In distinguishing the first, involuntary, movement of anger from anger itself, Seneca seems to be responding (or reporting his source’s response) to an objection to the Stoic view. The Stoics claim that the wise person—the Sage—will not become angry (or experience any emotion) but cannot deny that the Sage will, for example, flinch at the loud bark of a dog or a sudden loud clap of thunder. Why, the objector asks, would the Sage flinch, if flinching is assenting to the proposition that something bad has happened? By separating the involuntary from the voluntary, Seneca answers this criticism.

While Seneca occasionally addresses theoretical matters in this way, he more commonly approaches an issue—in this case, the emotions—from a different perspective. Seneca largely favors discussing issues from the perspective of the person who is making moral progress, rather than from the perspective of the wise person. This stands in contrast to other surviving Stoic texts, which tend to focus on the morally perfect agent—the ‘Sage’—and her qualities. Those texts often characterize the Sage in a way that sets her very much apart from normal human beings. Seneca’s concern, however, is with the circumstances of those who are aspiring to be and do better.

This orientation can be seen very clearly in passages or whole works (like On Anger, Consolation to Marcia, and others) where he aims to help those who are imperiled by emotions. The aim of these works is not to point out that the Sage does not experience anger or grief, nor is the aim even primarily to say why the Sage does not experience these emotions. Instead, the aim is to appeal to those who are not wise and to offer them advice, informed of course by Stoic theory, to help them re-orient their thinking about their circumstances. In On Anger, for example, Seneca advises that an angry person look in the mirror. Clearly, this person will not find a Sage in the mirror. Instead, Seneca thinks, he will find something in his appearance that does not resonate well with his thinking about himself. Elsewhere, Seneca advises that the person who is grieving consider the difference an audience makes. When one finds that one grieves more in the presence of an audience, Seneca thinks this will force one to reflect on what the grief is really about. Is one’s grief, in other words, directed at the one who is gone or at oneself? These kinds of strategies for dealing with emotions are, in any case, very far removed from arguments about the value of the emotions and still further removed from theoretical accounts of the nature of the emotions. Seneca is convinced that the Stoic view is right, and he finds support for this conclusion in less theoretical, and more practical, aspects of human life.

e. Natural Philosophy

The received view of the Roman Stoics, according to which they were concerned only with ethics, must be set aside in Seneca’s case. The opening lines of the Natural Questions articulate a view about the importance of physics that shows Seneca to be a clear exception. The very existence of the Natural Questions, one of Seneca’s longest philosophical treatises, shows this as well. He notes that “the difference between philosophy and other areas of study is as great as the difference, within philosophy itself, between the branch concerned with humans and the one concerned with the gods” (Praef.1, Hine, trans.). Seneca’s reference here to the branch concerned with the gods is a standard characterization of ‘physics’, one of the three Hellenistic divisions of philosophy that Seneca inherits. For the Stoics, the study of physics, or natural philosophy, included the study of the divine. In Letter 88, Seneca claims that the liberal arts, here noted as the ‘other areas of study’, are only important insofar as they prepare the mind for philosophical study. Seneca’s claim at the beginning of the Natural Questions, then, suggests that all philosophical study ultimately aims at an understanding of the gods. Even the “branch concerned with humans” (that is, ethics) has an aim beyond itself. According to the Stoic view, full moral progress requires a complete understanding of the nature of the divine. Seneca’s claims here, and elsewhere in the Natural Questions, suggest that he embraces the full range of Stoic philosophy despite the fact that most of his philosophical attention is devoted to matters central to the ‘branch concerned with humans.’

The outlines of Stoic physics are well documented in early sources. The Stoics are materialists, compatibilists, and theists. In the most general sense, the Stoics hold that the cosmos is entirely composed of matter but that certain forms of matter (fire, aether) are endowed with creative capacity. The human being’s mind is itself a composition of these elements. According to the Stoic view, the cosmos is a mind writ large, in the sense that the movements and developments in nature at the cosmic level are the result of guiding intelligence. For this reason, the Stoics regard “god,” “nature,” “fate,” and “providence” as roughly equivalent expressions. All refer to the active and creative element in the cosmos. To live according to nature ultimately requires that one come to adopt, or understand, the natural world from this cosmic perspective.

The surviving portions of Seneca’s Natural Questions are a survey of various meteorological phenomena undertaken in light of the broader Stoic understanding of the nature of the cosmos. Though the discussions are often narrowly focused on particular meteorological phenomena and their explanation, Seneca occasionally pauses to take a wider view. He considers, for example, the role that reflective surfaces (mirrors) play—and are supposed to play—in moral improvement (I.17 ff.). He explains the Stoic view that reason is the same for both gods and humans (Praef. 14). In a discussion of the cause of lightning (II.45), Seneca points to the Stoic view that “Jupiter,” “Providence,” “Fate,” and so on are all names for the active, divine element that shapes the universe.

The Natural Questions is an unfinished work. Passages like those above suggest that Seneca may have been revising or finishing the work with the aim of more carefully connecting his findings about meteorological phenomena to Stoic physics. They also suggest that, at least in some moments, Seneca may have been interested in providing a Stoic alternative to Lucretius’ explanation of many of the same phenomena in De Rerum Natura. The Stoic claim that the happenings of the natural world are guided by reason stands in stark contrast to the Epicurean view, articulated by Lucretius, that the world is generated and organized by chance.

f. Non-philosophical Works

Seneca wrote much besides his philosophical texts; however, much of his work has been lost. Lost are all of his speeches, including those he penned for Nero. Also lost are some philosophical treatises, though some fragments survive from a treatise on marriage. The surviving non-philosophical works include the Apocolocyntosis, a work satirizing the deification of Claudius, and eight tragedies: Agamemnon, Hercules Furens, Medea, Thyestes, Oedipus, Phaedra, Phoenissae, and Troades. Scholars have long disagreed about the relation between Seneca’s philosophical prose and his tragic poetry. At one end of the spectrum, some ancient sources regarded the author of the tragedies as a different Seneca altogether. While there is agreement now that our Seneca authored the tragedies, the relation between these works and his philosophical treatises is less agreed upon. On the one hand, the tragedies are clearly concerned with many Stoic themes that Seneca addresses in his philosophical works. Despite this point of intersection, though, the tragedies do not seem to say the same things about these themes. The most striking theme in this regard is the attention in the tragedies to the role of anger and other emotions. While the philosophical works (especially On Anger) attempt to persuade the reader to avoid becoming angry, the tragedies sometimes seem to elicit our sympathies for those who are angry and acting in anger. Similarly, as one commentator notes, the tragedies are rife with Stoic pronouncements (for example, “follow nature,” Phaedra, 481) that are put forward in a manner inconsistent with the Stoic principles to which they give voice.

The Phaedra illustrates the second phenomenon quite clearly. The title character, wife of Theseus, has fallen in love with her stepson, Hippolytus. After a failed effort to overcome her feelings for the boy, Phaedra’s cause of seducing Hippolytus is taken up by the Nurse, who agrees to help in order to prevent Phaedra’s suicide. The Nurse urges Hippolytus to “follow nature” as his guide. The Stoic imperative to follow nature is ordinarily understood as an injunction to live a life according to reason, to be virtuous, and to shun the circumstances of fortune. Here, though, the Nurse employs the phrase to encourage Hippolytus to do what most people do—namely, to pursue the pleasures of sex (Wilson, 2010). Hippolytus himself in this play seems, initially at least, to come closest to the Stoic ideal. In a long passage in Act II, he explains his love for the countryside and mountaintops, places in which he can be truly free from anger and other passions and from the vices that corrupt those who spend their time in society. Yet his peace comes at the price of seclusion and for the wrong reasons. The would-be sage seeks the isolation of the woods because of his hatred for all women. He notes that whether his hatred stems from “reason, nature, or passion” (567), it pleases him to hate them all.

The focus in the tragedies on the destructive force of emotions (especially anger) is plain. As one commentator notes, anger guides the action in all of Seneca’s plays (Wilson, 2010). In the Phaedra, Theseus’ anger at his son leads him to seek Hippolytus’ death. (Phaedra, whose advances were rejected by Hippolytus, has lied to her husband, accusing Hippolytus of raping her.) In the Medea, Medea’s anger at Jason leads her to murder her own children. In the Thyestes, Atreus’ anger leads him to murder Thyestes’ children and feed them to him. While these portrayals of emotion forge a connection between the tragedies and the prose works, what that connection is remains unclear. How, for example, should one understand the significance of Phaedra’s “What can reason do? Passion, passion rules!” (trans. Wilson), given Seneca’s claim elsewhere (On Anger II.1.4) that passions are voluntary?

Scholars have taken a number of positions on these issues. Some have argued that there is no connection between the tragedies and the philosophical works, while others have sought to show that the tragedies contain important philosophical lessons. Arguments of the latter kind are varied. Some have held that the tragedies are meant to illustrate the destructive influence of passions; others have argued that the tragedies should be read in light of Seneca’s Stoic metaphysics. These scholars emphasize the role of fate, providence, and divination in the tragedies. Finally, one scholar has argued that the guiding philosophical concern in the tragedies is epistemological (Staley, 2010). On this view, Seneca’s tragedies offer a kind of ‘clarification’ of the cognitive processes of those who are under the sway of passions.

Whatever relation they are ultimately thought to have to his philosophical works, Seneca’s tragedies, his Apocolocyntosis, and his lost speeches alert readers of his philosophical works to his literary talent. Scholars have rarely attempted a full account of all his works with the aim of clarifying, or even producing, a picture of Seneca the author. The difficulty of such an undertaking suggests that caution is needed in assuming that Seneca is primarily a philosopher. Seneca appears to have been comfortable writing in many genres. That comfort, moreover, provides a further clue that Seneca’s life was either plagued by or fortunate in (depending on how one sees it) his constant contact both with philosophy and with the politics and culture of Rome.

g. Criticism and Influence

Both Seneca’s life and his works have been targets of criticism since his own lifetime, during which, of course, he was charged and convicted of both adultery and conspiracy. Though the evidence in neither case is clearly decisive, these charges added to the growing criticism that Seneca’s way of life undermined his philosophical message. This criticism gained more traction from the fact that Seneca, who writes that poverty is not an evil, was one of the wealthiest men in the world. This criticism of Seneca was first made publicly by Publius Suilius, a political enemy of Seneca who was, according to Tacitus, angered by Nero’s revival of a law against pleading for money. Suilius, it seems, believed that this revival resulted from Seneca’s influence. Tacitus reports that Suilius taunted Seneca publicly, reminding the Roman elites of Seneca’s affair with Julia Livilla and, most importantly, asking the following question of his fellow Romans: “By what kind of wisdom or maxims of philosophy had Seneca within four years of royal favour amassed three hundred million sesterces?” (Tacitus, Annals XIII.42, Church & Brodribb, trans.). Although little independent evidence exists to confirm Suilius’ claim about the extent of Seneca’s wealth or how he acquired it, this passage from Tacitus’ Annals has served as a source for many readers of Seneca ever since. The result is that Seneca’s political enemy has in a way won the battle of public opinion. Scholars have noted that some caution is needed in evaluating this charge against Seneca, but the fact that Seneca was very wealthy and at the same time wrote that one should be content with what one has—and that poverty is, in itself, no evil—has been a lasting criticism.

This example points to a broader line of criticism: that Seneca is inconsistent. His wealth and his pronouncements about the value of poverty are but one example. To this can be added his praise of the philosophical life together with his recurrent involvement in Roman politics. Seneca is made, in Tacitus, to plead his case for retirement before Nero, yet Seneca is clearly (in both the Consolation to Helvia and the Consolation to Polybius) eager to return to Rome during his period of exile. Seneca seems, then, to have little but praise for the philosophical life withdrawn from the business of Rome, yet he cannot fully embrace that life himself. In his On Mercy, Seneca encourages the young emperor Nero to take to heart the point that while many may have the power to put others to death, he alone has the power to give life (that is, to allow life where the punishment of death is justified), yet Seneca may well have been party to Nero’s assassination of his own mother. At the very least, Seneca was unable to stop Nero. Again, Seneca upholds the importance of freedom from emotion in living a happy life. He encourages daily exercises to rid oneself of anger and other emotions, yet he writes tragedies in which unbridled emotions are the central focus. He encourages his readers to reflect on what is really theirs and to distance themselves from the inner workings of the political mob, yet he writes a political satire (the Apocolocyntosis) which assumes detailed knowledge of the inner workings of the imperial court under Claudius. Finally, Seneca is reported to have written Nero’s address for the funeral of Claudius. While this work is lost to us, it is unlikely that it had much in common with the Pumpkinification of Claudius, which he must have penned at around the same time.

These features of Seneca’s life and work have been both targets for criticism and spurs for investigation. To his credit, Seneca denies (even in the Letters, some of his latest works) that he is close to living a fully philosophical life. He works toward this goal but falls short. Notwithstanding his own profession of philosophical failure, the spirit of his philosophical works can seem (to the extent that we can see into his life) undermined by his role in Roman life. A number of views can be taken here. Perhaps Seneca simply fails to live the philosophical life he aspires to live. Perhaps his philosophical ambitions were really secondary to his political ambitions. While many scholars have noted the inconsistencies and many have rejected Seneca’s work on the grounds of hypocrisy, some scholars (notably Emily Wilson) have challenged this view. Wilson notes that, “The most interesting question is not why Seneca failed to practice what he preached, but why he preached what he did, so adamantly and so effectively, given the life he found himself leading” (Wilson, 2014).

A final and more philosophically substantive criticism also relies on a claim that there is some disparity between what Seneca advises and what Seneca does. This criticism, articulated by J. M. Cooper, argues that Seneca’s aim to guide his reader toward moral improvement is ultimately undermined by his advice to avoid the study of logic. Stoic theory requires that one have knowledge of ethics, physics, and logic. The Stoics, in fact, have much to say about the important interconnections among these three branches of study. Though one may begin with ethics, one’s philosophical study is simply not complete unless one has mastered the argument forms that fall under the scope of logic. Despite this, Seneca repeatedly tells his readers, particularly in the Letters, that the study of advanced logic, including Zeno’s syllogisms and certain logical fallacies, is a waste of time. In doing so, Seneca is advising his readers to avoid something that is, according to his own theory, necessary for moral progress.

Despite these criticisms, Seneca’s works have been widely read since his own lifetime. Seneca’s works, along with Cicero’s, were much more readily accessible to medieval Europeans who no longer read Greek. Thus, Seneca served for a long time as one of only a few sources for Stoic philosophy. Seneca’s works were well received by Christian thinkers in the Early Middle Ages. This was no doubt partly due to the forged correspondence (long thought to be genuine) between Seneca and the apostle Paul. Partly, though, Seneca’s acceptance by Christian thinkers was surely due to similarities between Christian and Stoic doctrines. Seneca’s doctrine of the first movements of emotions (those experiences of being drawn toward something, or the initial stirrings that precede becoming angry or grief-stricken) finds a welcome reception in Christian thinkers working on accounts of temptation and the original failings of human nature.

During and after the Renaissance, Seneca’s works continued to be read widely. How much Seneca alone, apart from other surviving Stoic sources (including Cicero’s philosophical works), influenced a particular philosopher’s thinking is difficult to tell, but Seneca was clearly read. Descartes, for example, used Seneca’s On the Happy Life as the basis for the ethical view he develops in his correspondence with Princess Elizabeth. A near contemporary of Descartes, Justus Lipsius, relied on Seneca’s philosophy heavily in his attempt to develop a new form of Stoicism suitable to his age. One can find many references to Seneca in the works of philosophers throughout the history of philosophy in Europe. Seneca’s influence and importance can perhaps be seen most clearly in cases where philosophers identify with Seneca’s philosophical views and at the same time sympathize with the circumstances of his life. Thomas More, for example, who was also an advisor to a powerful monarch, read Seneca widely. It has been noted that one source for More’s Utopia was likely Seneca’s (incomplete) treatise De Otio (On Leisure). There, Seneca notes that the ideal state is “no place” (nusquam).

The influence of Seneca’s work, especially his account of the emotions and their therapy, can be seen in the work of philosophers such as Foucault and Pierre Hadot, both of whom developed accounts of philosophy as a way of life. This includes a focus on the sources of one’s troubling emotions—anxiety, fear, anger—and on how philosophy can address them. In psychology, the Stoic account of the emotions as cognitive has been influential in the development of cognitive therapies. Albert Ellis, for example, who developed rational emotive behavior therapy (REBT), was heavily influenced by Stoic views of the emotions, and especially by Seneca.

3. References and Further Reading

a. Texts and Translations

All of Seneca’s works are available in English translation. For many years, the Loeb Series volumes, which print Latin and English side by side and were translated by Gummere (Letters) and Basore (Dialogi or Moral Essays), were the standard English translations. New translations of particular works or selections of letters have since been published. Inwood’s collection of seventeen philosophically substantive letters (listed below) contains extensive philosophical commentary.

  • Campbell, Robin, trans. Seneca: Letters from a Stoic. Penguin Classics. 2004.
  • Inwood, Brad, trans. Seneca: Selected Philosophical Letters. Oxford; New York: Oxford University Press, 2010.
  • Seneca, Lucius Annaeus. Epistulae Morales (Letters). Trans. Richard M. Gummere. London: Harvard University Press, 1917. 3 vols. Loeb.
  • Seneca, Lucius Annaeus. Moral Essays. Trans. John W. Basore. Cambridge, Mass.: Harvard University Press, 1928. 3 vols. Loeb.
  • Seneca, Lucius Annaeus. Tragedies. Trans. John G. Fitch. Annotated edition. Cambridge, Mass.: Harvard University Press, 2002. 2 vols. Loeb.
  • Wilson, Emily, trans. Seneca: Six Tragedies. Oxford World’s Classics. New York: Oxford University Press, 2010.

An effort to produce new translations of all of Seneca’s works is currently underway through the University of Chicago Press. As of 2015, the following four volumes were available.

  • Seneca, Lucius Annaeus. Anger, Mercy, and Revenge. Trans. Robert Kaster and Martha Nussbaum. Chicago; London: University of Chicago Press, 2010.
  • Seneca, Lucius Annaeus. Hardship and Happiness. Trans. Elaine Fantham et al. Chicago; London: University of Chicago Press, 2014.
  • Seneca, Lucius Annaeus. Natural Questions. Trans. Harry Hine. Chicago; London: University of Chicago Press, 2010.
  • Seneca, Lucius Annaeus. On Benefits. Trans. Miriam Griffin and Brad Inwood. Chicago: University of Chicago Press, 2011.

b. Secondary Literature

  • Bartsch, Shadi and Wray, David, eds. Seneca and the Self. Cambridge: Cambridge University Press, 2009.
    • A collection of essays evaluating Seneca’s contribution to the modern notion(s) of the Self.
  • Cooper, John M. Knowledge, Nature, and the Good: Essays on Ancient Philosophy. Princeton University Press, 2009.
    • Chapter 12, “Moral Theory and Improvement: Seneca,” argues that Seneca’s dislike for logic is incompatible with his Stoic allegiance.
  • Fitch, John G., ed. Oxford Readings in Classical Studies: Seneca. New York: Oxford University Press, 2008.
    • A collection of essays on many aspects of Seneca’s work—both philosophical and poetic.
  • Griffin, Miriam T. Seneca: A Philosopher in Politics. Oxford: Oxford University Press, 1992.
    • An extensive study of what Seneca’s philosophical writings can tell us about his role as a political agent.
  • Hadot, Ilsetraut. Seneca und die Griechisch-Römische Tradition der Seelenleitung. Berlin: Walter De Gruyter & Co., 1969.
    • Places Seneca’s work as a spiritual advisor to his audience in the context of Greco-Roman spiritual advice literature from Homer to Seneca.
  • Inwood, Brad. Reading Seneca: Stoic Philosophy at Rome. Oxford; New York: Oxford University Press, 2008.
    • A collection of essays that explicate Seneca’s thinking about a number of philosophical problems.
  • Ker, James. The Deaths of Seneca. New York: Oxford University Press, 2009.
    • An examination of Seneca’s life and work through the lenses of the various accounts of his death, both ancient and later.
  • Romm, James. Dying Every Day: Seneca at the Court of Nero. New York, Vintage Books, 2014.
    • A biography aimed at reconciling the apparently incompatible versions of Seneca—the wealthy man who praises poverty, the philosopher who is so engaged in politics, and so forth. Romm focuses consistently on the role that death and thinking about death play in Seneca’s life and works.
  • Wilson, Emily. The Greatest Empire: A Life of Seneca. Oxford: Oxford University Press, 2014.
    • A biography of Seneca informed by what is known about the dates of his philosophical and non-philosophical works. Wilson aims to explain, as much as possible, various tensions in the reception of Seneca.
  • Volk, Katharina, and Gareth D. Williams, eds. Seeing Seneca Whole: Perspectives on Philosophy, Poetry, and Politics. Brill, 2006.
    • A collection of essays from a variety of standpoints—philosophical, literary, historical—aimed at clarifying Seneca’s status as an author of many genres.

 

Author Information

Robert Wagoner
Email: wagonerr@uwosh.edu
University of Wisconsin Oshkosh
U. S. A.

Modal Logic: A Contemporary View

Modal notions go beyond the merely true or false by embedding what we say or think in a larger conceptual space referring to what might be or might have been, should be or should have been, or can still come to be. Modal expressions occur in a remarkably wide range of forms across natural languages, from necessity, possibility, and contingency to expressions of time, action, change, causality, information, knowledge, belief, obligation, permission, and far beyond. Accordingly, contemporary modal logic is the general study of the representation of such notions and of reasoning with them.

Although the origins of this study lie in philosophy, since the 1970s modal logic has developed equally intensive contacts with mathematics, computer science, linguistics, and economics; and this circle of contacts is still expanding. But at the same time, in its technical development, modal logic has also become something more, starting from the discovery in the 1950s and 1960s of various translations taking modal languages into systems of classical logic. The investigation of modalities has thereby become a study of the fine-structure of expressive power, deduction, and computational complexity, one that sheds new light on classical logics and interacts with them in creative ways.

This article presents a panorama of modal logic today in the spirit of the Handbook of Modal Logic, emphasizing a shared mathematical modus operandi with classical logic, and listing themes and applications that cross between disciplines, from philosophy and mathematics to computer science and economics. While this style of presentation does not disown the metaphysical origins of modal logic, it views these as just one of many valid roads toward modal patterns of reasoning. Other roads traveled in this article run through other areas of philosophy, such as the epistemology of knowledge and belief, or even through other disciplines, such as logics of space in mathematics, or logics of programs, actions, and games in computer science and game theory.

Table of Contents

  1. Modal Notions and Reasoning Patterns: a First Pass
  2. A Very Brief History of Modal Logic
  3. The Basic System: Modality on Graphs
  4. Some Active Current Applications
  5. Modern Themes across the Field
  6. Modal Logic and Philosophy Today
  7. Coda: Modal Logic as a Part of Standard Logic
  8. Conclusion
  9. References and Further Reading

1. Modal Notions and Reasoning Patterns: a First Pass

Modal logic as a subject on its own started in the early twentieth century as the formal study of the philosophical notions of necessity and possibility, and this tradition is still very much alive in philosophy (Williamson 2013). In this article, however, we will paint on a larger canvas and introduce the reader to what modal logic as a field has become a century hence. Still, for a start, it is important to realize that modal notions have a long historical pedigree. They were already studied by Aristotle and then by the medieval logicians (Kneale & Kneale 1961), who noted many peculiarities of this province of reasoning. Often these studies started from raw inferential intuitions that can take several forms. We may judge some pattern to be valid (say, necessity implies truth), we may judge others to be invalid (truth does not imply necessity), or we may have ideas about connections between different modal notions, such as the necessity of some proposition \phi being equivalent to the impossibility that not \phi. Modal logicians then start by introducing notation to make all this crystal-clear. Say, writing \Box \phi for necessary truth of \phi and \Diamond \phi for its possibility, the above claims amount to, respectively,

  • \Box \phi \rightarrow \phi    is valid
  • \phi \rightarrow \Box \phi    is not valid
  • \Box \phi \leftrightarrow \neg\Diamond\neg \phi    is valid.

Here \neg is negation. Later on, through the work of C. I. Lewis (Lewis & Langford 1932), further philosophical notions were drawn into this family, in particular that of entailment, or ‘strict implication’, between two propositions. Whereas the plain material implication \phi \rightarrow \psi expresses the fact that \phi and \neg \psi do not occur together:

    \[\neg(\phi \wedge \neg \psi)\]

an entailment is the stronger modal assertion that the two cannot occur together:

    \[\neg\Diamond(\phi \wedge \neg \psi).\]

With this notation, we can then swing into the second professional mode of logicians, operating on these forms by valid steps of abstract reasoning to discover new insights. For instance, a few simple inference steps with plausible modal principles will show that

    \[\neg\Diamond(\phi \wedge \neg \psi)\]

is equivalent to

    \[\Box(\phi \rightarrow \psi)\]

giving another view of entailment, this time as the necessity strengthening of the material implication. With such a proof calculus in hand, we can analyze many philosophical arguments from the classical and modern literature involving modality. Famous examples abound in the work of Arthur Prior, Peter Geach, Jaakko Hintikka, Stig Kanger, Saul Kripke, David Lewis, Robert Stalnaker, and other pioneers, all the way to the new wave of philosophical logicians of today. But we can also use a logical proof calculus, once we have settled on one, independently to reveal more of the abstract system of principles governing modality.

Still, this cozy world of intuitions and mathematical systems is not enough for many logicians. Why would some principles of modal reasoning be valid, while others are not? One common way of analyzing this further is by giving a semantic model for the meaning of the modalities that fits the earlier-stated facts. As it happens, such a model was given already centuries ago, following an idea going back to Leibniz and earlier thinkers like the Jesuit Luis Molina. We can explain the surplus of necessary truth over ordinary truth by going beyond the actual world s in terms of some larger universe W of metaphysically possible worlds. The truth of a statement \phi is truth at s only, while:

the necessity statement \Box \phi says that \phi is true in all possible worlds t.

Likewise, the existential modality \Diamond \phi says that \phi is true in at least one possible world. In this way, modalities become like standard universal and existential quantifiers, ranging over some suitably chosen larger family of worlds. Despite the existence of alternatives, and the occasional attack on the above framework, this quantifier view has been dominant since the 1950s, and it has influenced all that is to come in this article.
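Anticipating the notation introduced just below, and before any accessibility relation enters the picture, the Leibnizian reading amounts to the following two clauses for a model \mathbf{M} with universe W of worlds:

    \[\mathbf{M}, s \vDash \Box \phi \text{ iff for all } t \in W: \mathbf{M}, t \vDash \phi\]

    \[\mathbf{M}, s \vDash \Diamond \phi \text{ iff for some } t \in W: \mathbf{M}, t \vDash \phi\]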

Note how this setting makes the interpretation of necessity relative to the choice of a model \mathbf{M} containing all relevant possible worlds, something that will return in our later formal modal truth definition, which employs a ternary format:

\mathbf{M}, s \vDash \phi

formula \phi is true in model \mathbf{M} at world s.

Despite the metaphysical terminology, often retained for nostalgic reasons, such models have very different interpretations today. ‘Worlds’ can stand for situations, stages of a process, information states, locations in space, or just abstract points in a graph. This trend toward exploring a wider spectrum of interpretations was reinforced by the addition, in the 1950s, of a crucial further parameter (by Kanger, Hintikka, Kripke, and Montague) that increased the reach of modal logic immensely. We give each world in a model \mathbf{M} a range of its ‘accessible worlds’, and then let ‘necessity’ (or whatever this notion turns into on a concrete interpretation) range only over the accessible worlds.

If we define modal notions somewhat loosely as those that look beyond the actual here and now, then natural language is full of modality, since all our thinking and acting wades in a sea of possibilities, many of them never realized, but all-important to deliberation and decision, rational or otherwise. Explicitly modal linguistic expressions show a great variety: temporal (past, present, future), epistemic (know, believe, doubt, must), normative (may, ought), or causal, while there is also a lot of implicit modality, for instance in verbs like “seek” that can even refer to presumably (one more modal expression!) non-existent worlds containing a fountain of eternal youth. There is no good definition covering all these linguistic cases, though failure of substitution of extensional equivalents is often cited as a connecting thread: a ‘modal’ sentence operator can be sensitive to the substitution of propositions with the same truth value. This criterion looks a bit ‘symptom-based’, though, and perhaps a better criterion for spotting a modality is a semantic one: an expression is modal if determining its truth value may require looking ‘beyond the actual facts’. But the distinctive character of modality also shows in characteristic inference patterns, such as the many dualities instantiating the earlier equivalence

\Box \phi = \neg\Diamond\neg \phi and its dual \Diamond \phi = \neg\Box\neg \phi.

 

For instance, ‘always’ is ‘not sometimes not’, or ‘ought’ is ‘not permitted that not’. Such algebraic duality patterns are so ubiquitous that in the 1950s, it was even proposed to include them in broadcasts into outer space announcing our presence to other galactic civilizations – a truly fitting endeavor for possible worlds theorists.

By now the marriage between necessity, possible worlds, and universal quantification over these has become so ingrained that it may be hard to imagine other approaches. Nevertheless, other semantics exist for modal notions, such as the topological models we will mention later, which generalize possible worlds models in the accessibility style and even predate them historically. In fact, it is one of those ironies of scientific life that this more general semantics was already explicit in the 1930s in Tarski’s work on modality in topology and algebra, but it did not ‘take off’ the way the possible worlds paradigm did in the 1950s. And we need not think only in terms of semantics and model theory: a perfectly good alternative view of modality comes from proof theory. A proof-theoretic explanation of the surplus of stating a necessity \Box \phi over the plain truth of \phi is the existence of some strong a priori argument for \phi, perhaps a mathematical proof.

Interestingly, on the latter understanding, the ‘intensional surplus’ of modality comes as an existential, rather than a universal quantifier. And yet the proof-theoretic interpretation validates many base laws that also hold for the universal quantifier. For instance, the well-known law of Modal Distribution

    \[\Box(\phi \rightarrow \psi) \rightarrow (\Box \phi \rightarrow \Box \psi)\]

is valid on both views, though for intuitively different reasons. With universal quantification in models, it reflects the predicate-logical law

    \[\forall x(\phi \rightarrow \psi) \rightarrow (\forall x\phi \rightarrow \forall x\psi)\]

while in terms of the existence of proofs, it says that proofs for \phi and for \phi \rightarrow \psi can be combined into one for \psi. This harmony between the existence of a proof of a formula and its universal truth in some suitable semantic universe is of course not unheard of: it will be familiar to students of completeness theorems for logical systems.
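Spelled out on the model-based reading, the reasoning behind Modal Distribution is short; this is just the standard semantic argument. Suppose \mathbf{M}, s \vDash \Box(\phi \rightarrow \psi) and \mathbf{M}, s \vDash \Box \phi. Take any world t accessible from s. Then \mathbf{M}, t \vDash \phi \rightarrow \psi and \mathbf{M}, t \vDash \phi, hence \mathbf{M}, t \vDash \psi. Since t was arbitrary, \mathbf{M}, s \vDash \Box \psi.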

Modal logic is at work in many disciplines beyond philosophy, as one can see in the 2006 Handbook of Modal Logic or the conference series Advances in Modal Logic. Van Benthem 2010 is a textbook in modal logic with the same broad thrust. It is this breadth of the field that we are after in this article, though, in the context of an encyclopedia like this one, we will make special reference to interfaces of modal logic and philosophy, past and present, at various places.

2. A Very Brief History of Modal Logic

Aristotle already considered a calculus for reasoning with modal syllogistic forms like “every P is necessarily Q”. The topic continued in the Middle Ages, and we still find modality firmly entrenched as a major logical notion in the famous Table of Categories in Kant’s Kritik der Reinen Vernunft. All this was swept aside in the extensional turn of Frege’s Begriffsschrift in 1879. On one telling page the author enumerates a list of things for which he sees no need – and readers of some erudition will recognize the anonymous enemy as Kant’s Table of Categories. Nevertheless, in the twentieth century modal notions made their way back onto the logical agenda, leading to extensions of classical systems with operators of necessity, possibility, entailment, and other notions.

Over time, these formalisms have become influential as a tool for analyzing a wide range of philosophical arguments about various modal notions, such as the many beautiful examples of temporal reasoning in Prior 1967. But non-philosophical applications were never far away, starting with mathematics. Gödel 1933 showed how to embed Heyting’s intuitionistic propositional logic faithfully into the modal logic S4, Tarski 1938 showed how to axiomatize modal structures in topological spaces, and the classic paper Jónsson–Tarski 1951 provided a seminal technical apparatus for modal logic in terms of universal algebra, with representation theorems leading to accessibility-based possible worlds models. Nevertheless, it is often thought that modal logic is the tool par excellence for philosophical logic, giving the practitioner just the right expressive finesse to deal with metaphysical modality, time, space, knowledge, belief, counterfactuals, deontic notions, and so on. The Handbook of Philosophical Logic (Gabbay & Guenthner, eds., 1981–1987, 2001–2007) has a wide range of pertinent illustrations, also for many topics in this article. However, our focus implies no claim to exclusivity: for some philosophical fare the right conceptual cutlery may be first- or higher-order logic rather than modal logic. Or better yet, as we shall see soon, one can use both.

In some circles, modal logic still has a flavor of ‘alternative’ logic, a sort of counter-culture to standard systems like first-order logic. Some philosophers see the intensional character of modality as a challenge to, rather than a natural extension of, extensional notions. This also seems to be the view enshrined in some fashionable terminology that calls modal formulas not ‘true’ in models, as one does for ordinary logical languages, but ‘forced’ in some mysterious manner. This impression of exoticness is wildly obsolete, and modal languages will be a standard part of the heartland of logic in the perspective taken later on, applying also to a variety of standard topics in mathematical logic.

Moving beyond philosophy and mathematics, since 1970, modal logic has come to flourish at interfaces with linguistics: compare the treatment of intensional operators and verbs in Montague 1974, the modal grammar of Blackburn & Meyer Viol 1994, or modal logics of context in linguistics and AI such as the one in Buvac & Mason 1994. It has also thrived in computer science with dynamic or temporal logics of programs, logics of spatial structures, or modal description logics for knowledge: see the Handbooks van Leeuwen ed. 1991, Abramsky, Gabbay & Maibaum eds. 1992, Gabbay, Hogger & Robinson eds. 1997, Aiello, Pratt & van Benthem eds. 2007, or monographs such as Fagin, Halpern, Moses & Vardi 1995, Harel, Kozen & Tiuryn 2000. In fact, the range of applications is still growing, with seminal uses of modal logic in economics (for example, logics of knowledge in the foundations of game theory: see Leyton-Brown & Shoham 2008, Perea 2011), or new ventures in argumentation theory (Grossi 2010). We cannot compile a representative bibliography for the field in an article like this. Suffice it to say that the bulk of modal logic research today, both applied and pure, takes place inside or close to computer science and related fields.

Restated more in terms of themes, the major interpretations of modal formalisms these days fall under two main headings: information and action (van Benthem & Blackburn 2006). A typical modal formalism for analyzing information (though by no means the only one) is ‘epistemic logic’, where possible worlds are viewed as epistemic alternatives to the actual world, and the universal modality \Box \phi expresses knowledge in the sense of having the semantic information that \phi holds. A well-known formalism for action is ‘dynamic logic’, where worlds are states of some computational process, and a labeled modality [a]\phi says that all states reachable from the current one by performing action a satisfy \phi. We will discuss both of these interpretations in more detail below. The fact that modal laws can be similar in both cases highlights a deep conceptual duality between information and action that has also been noted by philosophers.
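In the graph notation used in the next section, the dynamic clause can be stated roughly as follows, writing s \xrightarrow{a} t for the accessibility relation associated with the action a (the arrow notation here is merely illustrative):

    \[\mathbf{M}, s \vDash [a]\phi \text{ iff for all } t \text{ with } s \xrightarrow{a} t: \mathbf{M}, t \vDash \phi\]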

In this process of expansion, but also for internal theoretical reasons that we shall see, modal operators are now often viewed as a special kind of ‘bounded quantifiers’, making modal logic, not an extension of classical logic, but rather a fragment in terms of its expressive power over possible worlds. As such its attraction acquires a new flavor. Rather than being baroque extensions of the sort that Frege rejected, modal languages have a charming austerity, and they demonstrate how ‘small is beautiful’.

But emphasizing distance from the original philosophical habitat may be misleading. Expats may return to their homeland, and indeed, many modern themes and results of modal logic make sense inside contemporary philosophy. They find continued and even reviving spheres of application in metaphysics, mereology, epistemology, meta-ethics, and other areas – and one might even make the case that information and action are just as crucial notions to philosophy as the original metaphysical modalities.

3. The Basic System: Modality on Graphs

In this section, we review the basic system of propositional modal logic, emphasizing key technical features. With this in place, we will survey extensions in later sections, while ending this article with a few deeper excursions to the contemporary scene.

Basic setting. Our basic idea is simply this: we describe properties of directed graphs consisting of points (‘possible worlds’ if you like grandeur) with directed links encoded in an ‘accessibility relation’ between points. A universal modality \Box \phi is true at a point in a graph if \phi is true at all points reachable by a directed arrow. Graphs are ubiquitous in many areas, and they are a good abstraction level for understanding what modal logic is about. And as we all know, pure high mountain air is good for you.

The basic modal language is a useful laboratory for logical techniques. We sketch the basic modal logic of graphs, including the usual topics of language, semantics, and axiomatics. But sticking to only these would mean ordering only part of the full menu available today, depriving you of the chance to acquire a richer palate. So, we will serve you richer fare in what follows, allowing you to appreciate more of a broader literature.

Language and semantics. We interpret formulas in models \mathbf{M} = (W, R, V), which may be viewed as directed graphs (W, R) with annotations for proposition letters, given by the valuation V sending each proposition letter p to the set of points V(p) where p is true. When evaluating complex formulas, one can take either the existential or the universal modality as a primitive (both have their comfort zones in logical research):

\mathbf{M}, s \vDash \Diamond \phi iff for some t with Rst, \mathbf{M}, t \vDash \phi

\mathbf{M}, s \vDash \Box \phi iff for all t with Rst, \mathbf{M}, t \vDash \phi

It helps to think of points in W as states of some kind, while accessibility encodes dynamic moves that can be made to get from one state to another. But there are many other useful views of these ‘decorated graphs’, including complete ‘worlds’.
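The truth definition is also easy to run on small examples. The following minimal model checker is a sketch in Python; it is not part of the article, and all names in it are illustrative. Finite models are encoded as dictionaries and formulas as nested tuples:

    # A minimal model checker for the basic modal language over finite models
    # M = (W, R, V). Formulas are nested tuples:
    #   ('p',) for a proposition letter, ('not', f), ('and', f, g),
    #   ('box', f) for the universal modality, ('dia', f) for the existential one.

    def holds(model, s, formula):
        """Return True iff the formula is true at world s in the given model."""
        W, R, V = model['W'], model['R'], model['V']
        op = formula[0]
        if op == 'not':
            return not holds(model, s, formula[1])
        if op == 'and':
            return holds(model, s, formula[1]) and holds(model, s, formula[2])
        if op == 'box':    # true at s iff true at every R-successor of s
            return all(holds(model, t, formula[1]) for t in W if (s, t) in R)
        if op == 'dia':    # true at s iff true at some R-successor of s
            return any(holds(model, t, formula[1]) for t in W if (s, t) in R)
        return s in V.get(op, set())   # base case: a proposition letter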

As an example, consider the following graph:

[Diagram 1: a directed graph on points 1, 2, 3, 4.]

Using the above truth definition, the formula \Diamond\Box\Diamond p is true at 1, 4, but it is false at 2, 3.
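Since the diagram itself is not reproduced above, the following toy model is only an assumption rather than the article’s own graph, but it yields the same verdicts for \Diamond\Box\Diamond p when fed to the sketch above:

    # An assumed graph on points 1-4 (not necessarily the one in the diagram).
    M = {'W': {1, 2, 3, 4},
         'R': {(1, 3), (2, 1), (4, 3)},
         'V': {'p': {1}}}

    phi = ('dia', ('box', ('dia', ('p',))))                   # the formula <>[]<>p
    print([s for s in sorted(M['W']) if holds(M, s, phi)])    # prints [1, 4]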

One conceptual finesse should be stressed here that is often ill-understood. Some critics find the ‘points’ in this picture too unstructured and poor to model lush possible worlds in some pre-theoretical philosophical sense. But the total modal structure of a point includes its environment, with all its interactions with other points through the relation R. This is more like the way we think of ‘objects’ in category theory: as given not so much by their internal structure as by their pattern of functional interactions with other objects. Indeed, modal models can be viewed as categories, and this, too, has proved a valid and rich interpretation – even though it is beyond the scope of this article.

Remark. There is a continuing historical discussion about the origins of this semantics. Often-quoted papers are Kripke 1959, 1963, but there were predecessors on the other side of the Atlantic, of which we mention Kanger 1957. To avoid taking sides through terminology, in this article, we choose neutral terms such as ‘models’ and ‘frames’.

Expressive power and invariance for bisimulation Languages are used to define and say things, a communicative function that may even be prior to reasoning. The expressive power of a modal language, or indeed any language, can typically be measured by a notion of similarity between different models, telling us what differences in structure the language can and cannot detect. Mathematically, such an analysis calls for a suitable ‘invariance relation’, or philosophically: a ‘criterion of identity’, between models – and finding one is a test on whether one has really understood a given logic. Here is an invariance relation that fits the basic modal language: it is not standard fare in philosophical textbooks, but learn it, and you will have entered the realm of modern modal logic.

Definition A bisimulation between two models \mathbf{M}, \mathbf{N} is a binary relation E between points m, n in the respective models such that, whenever m E n, then (a) m, n satisfy the same proposition letters, (b1) if m R m', then there exists a world n' with both n R n' and m' E n', (b2) the same ‘zigzag clause’ holds in the opposite direction.

Together, the atomic harmony for proposition letters and the two dynamic zigzag clauses, which can be invoked again and again, make bisimulation a natural notion of process equivalence, tracking the possible evolutions of a process step by step. Indeed, this notion was discovered independently in modal logic, computer science, and set theory.
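
As a computational aside, the largest bisimulation between two finite models can be found by a simple greatest-fixed-point refinement: start from all pairs in atomic harmony and repeatedly discard pairs violating a zigzag clause. The following Python sketch does this for two small models chosen here as assumptions for illustration (a two-point cycle and a single reflexive point).

```python
# A sketch of computing the largest bisimulation between two finite models by
# iterated refinement; the two toy models at the bottom are illustrative assumptions.

def largest_bisimulation(W1, R1, V1, W2, R2, V2):
    succ1 = lambda s: {t for (u, t) in R1 if u == s}
    succ2 = lambda s: {t for (u, t) in R2 if u == s}
    letters = set(V1) | set(V2)
    # start from atomic harmony
    E = {(m, n) for m in W1 for n in W2
         if all((m in V1.get(q, set())) == (n in V2.get(q, set())) for q in letters)}
    changed = True
    while changed:
        changed = False
        for (m, n) in list(E):
            forth = all(any((m2, n2) in E for n2 in succ2(n)) for m2 in succ1(m))
            back = all(any((m2, n2) in E for m2 in succ1(m)) for n2 in succ2(n))
            if not (forth and back):          # a zigzag clause fails: discard the pair
                E.discard((m, n))
                changed = True
    return E

# A two-point cycle is bisimilar to a single reflexive point (no proposition letters):
print(largest_bisimulation({1, 2}, {(1, 2), (2, 1)}, {}, {"a"}, {("a", "a")}, {}))
```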

Here is an example, disregarding proposition letters for simplicity. The two black worlds in the depicted models \mathbf{M}, \mathbf{N} are linked by a bisimulation consisting of all matches marked by dotted lines – but there is no bisimulation that includes a match between the black worlds in the following models \mathbf{N} and \mathbf{K}:

diagram 2

Here is a first case of ‘fit’: modal formulas are invariant for bisimulation.

Invariance Lemma If E is a bisimulation between \mathbf{M} and \mathbf{N} with m E n,
then m, n satisfy the same modal formulas.

In particular, we can show the failure of bisimulation between the above models \mathbf{N}, \mathbf{K} by noting that \mathbf{N} satisfies the modal formula \Diamond\Diamond\Box\bot (with \bot for the constant formula ‘false’) in its root (marked as a black dot), whereas \mathbf{K} does not.

The converse to the Lemma only holds for a modal language with arbitrary infinite conjunctions and disjunctions – or for the plain modal language over special models.

Proposition If m, n satisfy the same modal formulas in two finite models \mathbf{M}, \mathbf{N}, then there exists a bisimulation E between \mathbf{M}, \mathbf{N} with m E n.

There are many further definability results in modal model theory. For instance, for any model \mathbf{M}, s with designated point s, there is an infinitary modal formula \phi\mathbf{^{M,s}} true in only those models \mathbf{N}, t that are bisimilar to \mathbf{M}, s (that is, some bisimulation links t to s). Deeper model-theoretic studies of definability aspects of modal logic can be found in Blackburn, de Rijke & Venema 2001, Blackburn, van Benthem & Wolter, eds. 2006.

Invariance is of independent interest for its emphasis on comparisons between different models, a topic that seems somewhat neglected in philosophical logic. Barwise & van Benthem 1999 even have interpolation theorems casting bisimulation in the role of ‘transfer inference’, allowing us to find out facts about one model by reasoning about another model sufficiently ‘like it’. This brings us to the second main aspect of logic, providing a calculus of reasoning for the intended area of application.

Validity, proof systems, deductive power Universal validity in the basic modal logic is axiomatized in Hilbert-style by a system called the minimal modal logic K (for Kripke):

(a) all laws of propositional logic

(b) a definition of \Diamond \phi as \neg\Box\neg\phi

(c) the modal distribution axiom \Box(\phi\rightarrow \psi) \rightarrow (\Box \phi\rightarrow\Box \psi)

(d) the necessitation rule “if \vdash \phi, then \vdash \Box \phi”

This looks like a standard axiomatization of first-order logic with \Box as \forall, and \Diamond as \exists, but leaving out first-order axioms with tricky side conditions on freedom and bondage of terms:

\forall x\phi \rightarrow [t/x]\phi and
\phi \rightarrow \forall x\phi

Modal deduction is simple quantifier reasoning in a perspicuous variable-free notation. Many other formats for modal proof systems exist, such as sequent calculus or natural deduction. Modal proof theory is still an area in progress (Wansing, ed. 1996), but important strides are being made (compare Negri 2011).

Mathematical theory Starting from the 1970s, an extensive mathematical theory has sprung up for basic modal logic, including model theory and proof theory, while using perspectives from universal algebra. Instead of listing the classical references, we refer the reader to a modern monograph like Chagrov & Zakharyashev 1996, or the Handbook Blackburn et al. eds. 2006. In this article, we only mention a few highlights.

Translation and invariance One basic technique for putting modal logic in a broader perspective is a translation T from modal formulas \phi to first-order formulas T(\phi) with one free variable x having the same truth conditions on models \mathbf{M}, s:

(a) T(p) = Px,

(b) T commutes with Boolean operators,

(c) T(\Diamond \phi) = \exists y(Rxy \& [y/x]T(\phi)), T(\Box \phi) = \forall y(Rxy \rightarrow [y/x]T(\phi)).

With some care, only 2 variables x, y are needed in these translations (free or bound). For instance,

    \[\Box\Diamond\Box p\]

translates faithfully into

    \[\forall y(Rxy \rightarrow \exists\mathbf{x}(Ry\mathbf{x} \wedge \forall y(R\mathbf{x}y \rightarrow Py)))\]
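
The translation can be written down as a short recursive procedure. Here is a Python sketch that reuses only the two variables x and y, as in the example just given; the tuple encoding of formulas and the E/A notation for the quantifiers are assumptions made for illustration.

```python
# A sketch of the two-variable standard translation from modal to first-order syntax.
# E and A stand for the existential and universal quantifier; formulas are nested tuples.

def T(formula, free="x"):
    other = "y" if free == "x" else "x"
    op = formula[0]
    if op == "atom":
        return f"{formula[1].upper()}{free}"
    if op == "not":
        return f"~{T(formula[1], free)}"
    if op == "and":
        return f"({T(formula[1], free)} & {T(formula[2], free)})"
    if op == "dia":
        return f"E{other}(R{free}{other} & {T(formula[1], other)})"
    if op == "box":
        return f"A{other}(R{free}{other} -> {T(formula[1], other)})"
    raise ValueError(op)

print(T(("box", ("dia", ("box", ("atom", "p"))))))
# Ay(Rxy -> Ex(Ryx & Ay(Rxy -> Py)))
```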

Here is the essential semantic feature that makes these translated modal formulas special inside the full first-order language over the signature R^{2}, P^{1}, Q^{1}, \ldots of models:

Modal Invariance Theorem The following assertions are equivalent for all first-order formulas \phi = \phi(x): (a) \phi is equivalent to a translated modal formula, (b) \phi is invariant for bisimulations.

The resulting modal fragment of first-order logic turns out to share nice properties of the full system such as Compactness, Interpolation, Löwenheim-Skolem, model-theoretic preservation theorems, and others. This is not automatic inheritance, and classical meta-proofs often need to be adapted creatively using bisimulation. But unlike first-order logic, modal logic is decidable – revealing fine-structure inside classical logic, with a delicate balance between expressive power and computational complexity.

The fragment perspective is quite general: many other modal languages live inside first-order logic or other standard logics under some translation for their standard semantics. We will see later what makes these fragments so well behaved.

Landscapism A typical feature of modal logic has to do with its historical proliferation of deductive systems: ‘modal logics’ of different proof strength inside the same basic language. On top of the minimal logic, there are uncountably many different normal modal logics given by the same rules of inference as above plus various sets of axiom schemata. This deductive landscape has two major highways, because of the following:

Theorem Every consistent normal modal logic is either a subset of the logic Id with characteristic axiom \phi \leftrightarrow \Box \phi, or of Un with axiom \Box\bot.

On the former road lie well-known systems like T, S4, S5, but the latter road has landmarks such as Löb’s logic of arithmetical provability axiomatized by

    \[\Box(\Box \phi \rightarrow\phi) \rightarrow \Box \phi\]

Logics in this deductive landscape can be studied by proof-theoretic methods, but also semantically – once we find completeness theorems bridging the two realms.

Completeness Let us now turn to the way in which modal logics viewed as deductive systems are correlated with semantic models. A typical completeness theorem is this:

Theorem A modal formula is provable in K4 (minimal K plus the axiom \Box \phi \rightarrow \Box\Box \phi) iff it is true in all models whose accessibility relation is transitive.

There are many techniques for proving such results, ranging from simple inspection of the canonical Henkin model of all complete theories in the logic to forms of drastic model surgery. The demand for completeness theorems comes from two sides. Either one has a pre-existing modal logic given by syntactic axioms and rules (like many first-generation modal systems), and seeks a useful matching model class – or one has a natural model class (say, some interesting space-time structure), and wishes to axiomatize its laws for simple modal reasoning. The literature is replete with both. In this survey, we do not pursue either line, but they are very well-documented (Blackburn, de Rijke & Venema 2001, Chagrov & Zakharyashev 1996, amongst many sources).

Correspondence The correspondence between modal axioms and special properties of the accessibility relation in a class of models continues to be one of the major attractions of modal logic. It can be studied directly, calling a modal formula true in a frame (W, R) (a model stripped of its valuation) if it holds under all valuations. Many modal axioms then correspond to simple first-order properties. The Sahlqvist Theorem describes an effective method constructing first-order equivalents from modal axioms of a suitable shape, which has by now reached the world of automated theorem proving. It proceeds by substituting first-order descriptions of ‘minimal valuations’ into the first-order translation of a modal axiom to get a natural first-order equivalent, if available.

As an instance of this procedure, a K4 axiom

    \[\Box p \rightarrow \Box\Box p\]

has a first-order translation

    \[\forall y(Rxy \rightarrow Py) \rightarrow \forall y(Rxy \rightarrow \forall z(Ryz \rightarrow Pz))\]

A minimal valuation for p making the antecedent true is Pu = Rxu. Substituting this, and dropping the tautological antecedent, we obtain

    \[\forall y(Rxy \rightarrow \forall z(Ryz \rightarrow Rxz))\]

that is, frame transitivity. Principles with no first-order frame equivalent include the McKinsey Axiom

    \[\Box\Diamond p \rightarrow \Diamond\Box p\]

and our earlier Löb Axiom.
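
The correspondence for the K4 axiom can even be tested mechanically on small finite frames. The following Python sketch checks by brute force that, on every frame over a three-point set, the axiom \Box p \rightarrow \Box\Box p is valid under all valuations exactly when the accessibility relation is transitive; the three-point universe is just an illustrative assumption.

```python
# A brute-force check of the frame correspondence between the K4 axiom and transitivity
# on all frames over a three-point set (an illustrative assumption).
from itertools import chain, combinations

W = [0, 1, 2]
pairs = [(a, b) for a in W for b in W]

def powerset(xs):
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def box(R, X):
    """Points all of whose R-successors lie in X."""
    return {s for s in W if all(t in X for (u, t) in R if u == s)}

def valid_4(R):
    """Is Box p -> Box Box p true at every point under every valuation of p?"""
    return all(box(R, set(P)) <= box(R, box(R, set(P))) for P in powerset(W))

def transitive(R):
    return all((a, c) in R for (a, b) in R for (b2, c) in R if b == b2)

for edges in powerset(pairs):                 # all 512 frames on three points
    R = set(edges)
    assert valid_4(R) == transitive(R)
print("On 3-point frames, K4 is frame-valid exactly on the transitive ones")
```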

Correspondence theory has produced many general results. One classic is a theorem in Goldblatt & Thomason 1975 that we state for its form only, omitting details. A first-order frame-property is modally definable iff it is preserved under taking (a) generated subframes, (b) p-morphic frame images, (c) disjoint unions, and (d) inverse ultrafilter extensions. Correspondence theory involves a study of simple modal fragments of the complex realm of monadic second-order logic, a perspective we will not pursue here.

Digression This is the classical view of correspondence (van Benthem 1984). But one can always rethink orthodoxy. Are the usual ‘modal logics’ with their special axioms really logics, or theories of special domains over a unique minimal logic? Special frame properties are nice, but they may be in need of further explanation that suggests alternative views. For example, transitivity is an effect of closing an accessibility relation under iterations, and then K4 is the logic of a special closure modality definable in just the minimal K-style ‘dynamic logic’ to be discussed below.

Next, we move to two basic themes that have risen to prominence since the late twentieth century—not just in modal logic, but also for logical systems generally.

Computation The basic modal language is a decidable miniature of first-order logic. There are many decision methods for validity or satisfiability exploiting special features of modal formulas – each with their virtues. Well-known methods are selection, filtration, and reduction, for which we refer to the literature (Marx 2006).

But there is a deeper issue here, going beyond the traditional understanding of logical systems. What is the precise computational complexity of various key tasks for a logic, allowing us to gauge its difficulty as a device to be used seriously? These key tasks include testing for satisfiability, but also model checking for truth, as well as comparing models. Here are the facts for the basic modal logic. (a) Given a finite model \mathbf{M}, s and a modal formula \phi, checking whether \mathbf{M}, s \vDash \phi takes polynomial time in length(\phi) + size(\mathbf{M}). This is better than for first-order logic, where this task takes polynomial space. (b) Checking if a modal formula \phi has a model takes polynomial space in the size of \phi. For first-order logic, this is undecidable. (c) Checking if there is a bisimulation between finite models \mathbf{M}, s and \mathbf{N}, t takes polynomial time in the size of these models.

These benchmark complexities for logics differ as languages are varied. Complexity awareness may be a new feature to many logicians and philosophers, but computational behavior seems a feature of basic importance in understanding formal frameworks.

Interaction and games The modern view of computation is one of interactive agency (compare the AAMAS conferences, http://www.ifaamas.org/index.html), and accordingly, games provide a new perspective on logics (van Benthem 2014), including modal logic. In a modal evaluation game, two players Verifier (V) and Falsifier (F) disagree about a formula at point s in a given model \mathbf{M}. Disjunction is a choice for V, conjunction for F, negation is a role switch, \Diamond makes V pick a point reachable from the current point, \Box does the same for F. A play ending at an atom p is won by V if p holds at the current point, and by F otherwise. A player also wins if the opponent has no move for a modality.

The crucial equivalence governing this game is as follows:

Fact \mathbf{M}, s \vDash \phi iff Verifier has a winning strategy for the \phi-game in \mathbf{M} starting at s.

Here is an example. Our first model picture when we introduced the basic semantics induces the following tree for an evaluation game for the formula

    \[\Diamond\Box\Diamond p\]

starting from point 1, with boldface indicating the winning positions for Verifier:

diagram 3

In this game, V has two winning strategies: <left, right> and <right, down>. These are indeed the two possible successful ways of verifying \Diamond\Box\Diamond p in the given model at point 1.
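
The game clauses can also be turned into a small recursive procedure that decides whether Verifier has a winning strategy, with a role switch at negations. The following Python sketch reuses the hypothetical graph from the earlier model-checking sketch; as before, the concrete model is an assumption for illustration.

```python
# A sketch of the modal evaluation game: does the original Verifier V have a winning
# strategy for a formula at a point? The model is the same illustrative assumption as before.

W = {1, 2, 3, 4}
R = {(1, 2), (2, 3), (3, 4), (4, 2)}
V = {"p": {2, 4}}

def succ(s):
    return {t for (u, t) in R if u == s}

def v_wins(formula, s, role="V"):
    """True iff player V wins the game, given that V currently occupies `role`."""
    op = formula[0]
    if op == "atom":
        fact = s in V[formula[1]]
        return fact if role == "V" else not fact
    if op == "not":                                  # negation: the players switch roles
        return v_wins(formula[1], s, "F" if role == "V" else "V")
    # At 'or' and 'dia' the player in the Verifier role chooses;
    # at 'and' and 'box' the player in the Falsifier role chooses.
    by_verifier_role = any if role == "V" else all
    by_falsifier_role = all if role == "V" else any
    if op == "or":
        return by_verifier_role(v_wins(g, s, role) for g in formula[1:])
    if op == "and":
        return by_falsifier_role(v_wins(g, s, role) for g in formula[1:])
    if op == "dia":                                  # a player with no available move loses
        return by_verifier_role(v_wins(formula[1], t, role) for t in succ(s))
    if op == "box":
        return by_falsifier_role(v_wins(formula[1], t, role) for t in succ(s))
    raise ValueError(op)

phi = ("dia", ("box", ("dia", ("atom", "p"))))
print(v_wins(phi, 1))            # True: Verifier wins at point 1, so the formula is true there
print(v_wins(("not", phi), 1))   # False: after the role switch, the original Verifier loses
```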

This style of analysis is widespread in the current literature. There are also model comparison games between players Duplicator (maintaining an analogy) and Spoiler (claiming a difference), playing over pairs of points (m, n) in two given models \mathbf{M}, \mathbf{N}. This may be seen as a fine-structured way of checking for existence of a bisimulation, where successor states chosen in one model by Spoiler must be matched by successors in the other model, chosen by Duplicator, while atomic harmony always remains. Without providing details, we note that in games like this, (a) Spoiler’s winning strategies in a k-round game between \mathbf{M}, s and \mathbf{N}, t match the modal formulas of operator depth k on which the points s, t disagree, (b) Duplicator’s winning strategies over an infinite round game between \mathbf{M}, s, \mathbf{N}, t match the bisimulations linking s to t.

Many other logical notions can be ‘gamified’. For instance, proof games find deductions or counter-examples through a dialogue between two players about some initial claim. And always, the logical core notion turns out to match a strategy for interactive play.

This completes our sketch of basic modal logic as a meeting place for a wide range of logical notions, techniques, and results. In Section 4, we look at some concrete modern applications in more detail, and then in Section 5, we identify further general issues to which these give rise, reinforcing the role of modal logic as a conceptual lab.

4. Some Active Current Applications

We have given some information on the attractions of the basic system of abstract modal logic. At the same time, it is also important to see that many different concrete interpretations can be attached to this system, and how diverse these are.

Knowledge and belief One of the major interpretations of modal logic in use today reads modalities as operators of knowledge or belief (Hintikka 1962, Stalnaker 1984), though this reading is itself a subject of ongoing debate. Languages like this express many further basic epistemic patterns that occur in natural discourse, such as:

K_{i}\phi \vee K_{i}\neg \phi     “agent i knows whether \phi is the case”

On this interpretation, standard modal axioms acquire a new epistemic flavor, such as:

‘Positive introspection’: K_{i}\phi \rightarrow K_{i} K_{i}\phi
‘Negative introspection’: \neg K_{i}\phi \rightarrow K_{i} \neg K_{i}\phi

again readings that have been subject to critical debate.

A major new theme in the epistemic setting is a social one. It is not lonely thinkers that are essential to cognition, but interaction among different agents i, j in a group, involving what they know about each other – in patterns such as K_{i}K_{j}\phi or K_{i}\neg K_{j}\phi. What I know about your knowledge or ignorance is crucial, both to my understanding and to my actions. For instance, I might empty your safe tonight if I believe that you do not know that I know the combination. Some forms of group knowledge transcend simple iterations of individual knowledge assertions. A key example is common knowledge: if everyone knows that your partner is unfaithful, you have private embarrassment – if it is common knowledge, you have public shame. Technically, this works as follows in our models. A new common knowledge modality C_{G}\phi says that \phi holds at every world reachable via a finite chain of uncertainty relations for agents in G.

For instance, in the following picture, where epistemic accessibility is an equivalence relation, the atomic fact p holds in the current world, marked by the black dot:

diagram 4

In the current world, our semantics yields the following further facts:

(a) agent Q does not know whether p: \neg K_{Q}p \wedge \neg K_{Q}\neg p,
(b) agent A does know that p: K_{A}p, while
(c) it is common knowledge in the group {Q, A} that A knows whether p: C_{{Q, A}}(K_{A}p \vee K_{A}\neg p).

Incidentally, this is a good situation for Q to ask A the question whether p is true: but more on epistemic actions below. Common knowledge treated in a modal style is a widely used notion by now in philosophy (Lewis 1969), but also in computer science (Fagin et al. 1995) and game theory (Aumann 1976, Battigalli & Bonanno 1999).
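
Since diagram 4 is not reproduced here, the following Python sketch reconstructs a two-world model matching the description in the text (Q cannot tell the two worlds apart, A can), and computes individual and common knowledge by closing off under the agents' relations. The concrete worlds and relations are assumptions for illustration.

```python
# A sketch of knowledge and common knowledge on a reconstructed two-world model;
# the worlds and relations are assumptions mirroring the description in the text.

W = {"w", "v"}                      # w: p true (the actual world), v: p false
V = {"p": {"w"}}
REL = {"Q": {("w", "w"), ("v", "v"), ("w", "v"), ("v", "w")},   # Q cannot tell w from v
       "A": {("w", "w"), ("v", "v")}}                           # A can

def K(agent, prop):
    """Worlds where the agent knows prop (a proposition given as a set of worlds)."""
    return {s for s in W if all(t in prop for (u, t) in REL[agent] if u == s)}

def C(group, prop):
    """Worlds where prop holds at every world reachable by a finite chain of the
    group members' accessibility relations, that is, where prop is common knowledge."""
    result = set()
    for s in W:
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for i in group:
                for (a, b) in REL[i]:
                    if a == u and b not in seen:
                        seen.add(b)
                        stack.append(b)
        if seen <= prop:
            result.add(s)
    return result

p = V["p"]
a_knows_whether = K("A", p) | K("A", W - p)
print("w" in K("Q", p), "w" in K("A", p))       # False True
print("w" in C({"Q", "A"}, a_knows_whether))    # True: common knowledge that A knows whether p
```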

Similar models can represent belief. This is often done a bit crudely by adding one more accessibility relation that is no longer reflexive to allow for false beliefs. But more illuminating is a richer approach (Grove 1988, Baltag & Smets 2008). Thinking of equivalence classes of the epistemic relation as the total range of what an agent knows, we endow these with binary plausibility orderings that encode what the agent considers less or more plausible. Then a belief modality B\phi is interpreted as saying that \phi is true in all most plausible epistemically accessible worlds. And plausibility models also support a richer notion, namely a binary modality of conditional belief B^{\psi}\phi saying that \phi is true in all most plausible epistemically accessible worlds that satisfy \psi. Unlike the situation with conditional knowledge, conditional belief cannot be defined in terms of absolute belief. Indeed, the logic of conditional belief is much like modal logics for conditional assertions in models with similarity relations (Lewis 1973, Burgess 1981, Veltman 1985).
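
The plausibility machinery is easy to picture on a toy example. The following Python sketch, a minimal illustration under assumed worlds and an assumed plausibility ranking, computes belief and conditional belief as truth in the most plausible (conditionally restricted) worlds of a single epistemic range.

```python
# A sketch of belief and conditional belief on a plausibility model; the worlds,
# valuation, and ranking below are illustrative assumptions (lower rank = more plausible).

worlds = {"w1", "w2", "w3"}
V = {"p": {"w1", "w2"}, "q": {"w2", "w3"}}
rank = {"w1": 0, "w2": 1, "w3": 2}

def best(prop):
    """Most plausible worlds among those satisfying prop."""
    candidates = prop & worlds
    if not candidates:
        return set()
    m = min(rank[w] for w in candidates)
    return {w for w in candidates if rank[w] == m}

def believes(prop):                          # B phi: phi holds in all best worlds
    return best(worlds) <= prop

def believes_given(condition, prop):         # B^psi phi: phi holds in all best psi-worlds
    return best(condition) <= prop

print(believes(V["p"]))                            # True: the most plausible world w1 is a p-world
print(believes_given(V["q"], V["p"]))              # True: the best q-world is w2, a p-world
print(believes_given(worlds - V["p"], V["q"]))     # True: conditioning on not-p shifts belief to w3
```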

Caveat Our use of the terms ‘knowledge’ and ‘belief’ is mainly a tribute to the tradition. Most philosophers and logicians no longer think of the above modalities as modeling real knowledge and belief, and think of K\phi rather as representing the agent’s semantic information (Carnap 1947), and of B\phi as what is true according to ‘the best of the agent’s information’. For a current modal study of how to model genuine notions of knowledge using more sophisticated philosophical intuitions, see Holliday 2012.

Dynamic logic of action Accessibility arrows can also be viewed quite differently, not in terms of knowledge and information, but as transitions for actions viewed as changing states of some relevant process, a computation, or a general course of events (Harel, Kozen & Tiuryn 2000). Modalities now get labeled with explicit action expressions to show what they range over. In dynamic logic – originally designed to describe execution of computer programs, but now used as a general logic of action,

[\pi]\phi says that after every successful execution of action \pi, \phi holds.

Read in this way, modal statements now relate actions to ‘postconditions’ describing their effects and also to ‘preconditions’ for their successful execution. Concrete models of this sort are process graphs describing the possible workings of some computer or abstract machine. For instance, a labeled formula [a]< b >p says that, at the current starting state, after every execution of action a (there may be zero, one or more ways of doing this), it is possible to then perform action b to achieve a state where p holds.

Another concrete model for dynamic logic are games, where actions are moves available to several players. For instance, in the following game tree, player E has a strategy for achieving an outcome satisfying p against any play by player A:

diagram 5

This strategic assertion is captured by the modal formula [a\cup b]<c\ \cup \ d>p.

Again we get a minimal modal logic, this time a two-level system treating propositions and actions denoting transition relations on a par. This joint setup allows for an analysis of important action constructions, encoded in valid principles of dynamic logic:

    \[[\pi ; \pi']\phi \leftrightarrow [\pi][\pi']\phi\]

sequential composition

    \[[\pi \cup \pi']\phi \leftrightarrow ([\pi]\phi \wedge [\pi']\phi)\]

choice

    \[[(\phi)?]\psi \leftrightarrow (\phi \rightarrow \psi)\]

test for proposition \phi

A major new feature here is unbounded finite repetition of actions: \pi^{*}. This notion is typical for computation, but also for action in general (‘keep adding salt to bring up to taste’) and it is not first-order definable. This shows in two more axioms:

    \[[\pi^{*}]\phi \leftrightarrow (\phi \wedge [\pi][\pi^{*}]\phi)\]

fixed-point axiom

    \[(\phi \wedge [\pi^{*}](\phi \rightarrow [\pi]\phi)) \rightarrow [\pi^{*}]\phi\]

induction axiom

Dynamic logics resemble infinitary fixed-point extensions of classical logic, but with a modal stamp: like the basic modal logic, they are bisimulation-invariant and decidable, forming a core calculus for reasoning about the essentials of recursion and induction. Fixed-point definitions are ubiquitous in computer science, mathematics and linguistics, as many natural scientific notions involve recursion. An elegant powerful system of this kind generalizes dynamic logic by adding a facility for arbitrary fixed-point definitions: the so-called \mu–calculus that we will consider briefly below.
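
The relational reading of programs, including unbounded iteration, can be computed directly on a finite model. Here is a Python sketch in which composition, choice, and Kleene star are set operations on transition relations; the state set, atomic programs, and valuation are assumptions for illustration.

```python
# A sketch of the relational semantics of dynamic logic programs on a finite model;
# the states, atomic programs, and valuation are illustrative assumptions.

states = {0, 1, 2, 3}
atomic = {"a": {(0, 1), (1, 2)}, "b": {(2, 3), (1, 3)}}
facts = {"p": {3}}

def compose(R1, R2):                 # pi ; pi'
    return {(x, z) for (x, y) in R1 for (y2, z) in R2 if y == y2}

def choice(R1, R2):                  # pi U pi'
    return R1 | R2

def star(R):                         # pi*: zero or more repetitions (reflexive-transitive closure)
    result = {(s, s) for s in states}
    while True:
        extra = compose(result, R) - result
        if not extra:
            return result
        result |= extra

def box(R, prop):                    # [pi] phi: all pi-successors satisfy phi
    return {s for s in states if all(t in prop for (u, t) in R if u == s)}

def dia(R, prop):                    # <pi> phi: some pi-successor satisfies phi
    return {s for s in states if any(t in prop for (u, t) in R if u == s)}

print(box(star(compose(atomic["a"], atomic["b"])), facts["p"]))   # {3}: note the fixed-point axiom
print(dia(star(choice(atomic["a"], atomic["b"])), facts["p"]))    # {0, 1, 2, 3}: p is reachable everywhere
```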

Information update Different kinds of modal logic can also form new combinations. For example, logics of information change arise by combining knowledge and action. Our earlier epistemic formulas tell us what information agents have right now, but they do not say how this information changes, through acts of observation, communication, or learning in general. To model such cognitive actions, we need to combine epistemic and dynamic logic. One powerful idea here is that an information update changes the current epistemic model. In the simplest case, reflecting a ubiquitous common-sense intuition, this update mechanism works as follows, decreasing the current epistemic range:

a public announcement !\phi of a proposition \phi to a group of agents eliminates all worlds in the current epistemic model \mathbf{M} that satisfy \neg \phi.

Suppose that in our earlier two-agent two-world picture, Q asks A: “p?” and A then truthfully answers “Yes”. Then the \neg p-world gets eliminated, and we are left with a one-world model where p has become common knowledge among {Q, A}.

But more subtle cases are possible, even with simple models. For example, a question itself may convey crucial information. By asking, Q conveys the information that she does not know whether p. Even if A did not know the answer at the start, this may tell him enough to settle p, and now answer the question. Here is a case where this happens:

diagram 6

But the modeling power of epistemic dynamics is still higher. Suppose that neither Q nor A knew whether p, but A asks expert R, who answers only to A. Then A learns whether p, Q is no wiser about p, but it has become common knowledge that A knows if p. This private act requires a new update changing models by ‘link elimination’:

diagram 7

The modal logic of update has some delicate features. For instance, a public announcement that some formula \phi is the case need not always result in our learning that \phi holds in the updated model. The reason is that truth value switches may happen when announcing formulas \phi that contain a statement of ignorance. A well-known example is ‘Moore sentences’ of the form p \wedge \neg Kp, which become false after announcement.
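
The update mechanism and the Moore-sentence effect fit in a few lines of code. The following Python sketch, using a hypothetical two-world model mirroring the question-answer scenario above, restricts the model to the worlds where the announced formula holds and then re-evaluates the Moore sentence.

```python
# A sketch of public announcement as model restriction, showing a Moore sentence
# p & ~Kp becoming false after it is truthfully announced. The model is an assumption.

W = {"w", "v"}
V = {"p": {"w"}}
R = {("w", "w"), ("v", "v"), ("w", "v"), ("v", "w")}     # the hearer's uncertainty

def holds(formula, s, W, R):
    op = formula[0]
    if op == "atom":
        return s in V[formula[1]]
    if op == "not":
        return not holds(formula[1], s, W, R)
    if op == "and":
        return holds(formula[1], s, W, R) and holds(formula[2], s, W, R)
    if op == "K":                      # knowledge: true in all accessible surviving worlds
        return all(holds(formula[1], t, W, R) for (u, t) in R if u == s and t in W)
    raise ValueError(op)

def announce(formula, W, R):
    """Publicly announcing formula: keep only the worlds where it holds."""
    keep = {s for s in W if holds(formula, s, W, R)}
    return keep, {(u, t) for (u, t) in R if u in keep and t in keep}

moore = ("and", ("atom", "p"), ("not", ("K", ("atom", "p"))))
print(holds(moore, "w", W, R))         # True before the announcement
W2, R2 = announce(moore, W, R)
print(holds(moore, "w", W2, R2))       # False afterwards: the hearer now knows p
```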

Algorithms for model updates covering a wide range of communicative acts, public or private, and matching complete modal logics for formulas [\phi]K\psi have been studied extensively in dynamic epistemic logic (Baltag, Moss & Solecki 1998, van Ditmarsch et al. 2007, van Benthem 2011). Similar logics can deal with acts of belief change, triggered either by the above events !\phi of public ‘hard information’ or by softer triggers that rearrange the current plausibility ordering (‘soft information’), yielding a new model supporting suitably modified absolute and conditional beliefs. Actions of plausibility change have been studied in belief revision theory (Gärdenfors & Rott 1995, Segerberg 1995), in dynamic-epistemic logics (see the earlier references on this field), and in formal learning theory (Kelly 1996, Gierasimczuk 2010).

Intuitionistic logic and provability logic Let us now move from information and action to the grand themes of mathematics. Modal logic has also been used to model constructive reasoning as encoded in intuitionistic logic where truth is reinterpreted in terms of being established, or having a proof (Kripke 1965, Troelstra & van Dalen 1988). Our earlier models can now be viewed as universes of information stages, and accessibility is upward extension. Intuitionistic logic is then about persistent assertions that, once established, remain true upward in the information order. In particular, as mentioned earlier, Gödel 1933 gave a faithful translation from intuitionistic logic into the modal logic S4, reading intuitionistic conjunction and disjunction as their standard counterparts, but sending intuitionistic negation ~ to the strengthened modal combination \Box\neg, and intuitionistic implication \phi \rightarrow \psi to the modalized material implication \Box(\phi \rightarrow \psi). The full modal language also contains non-persistent assertions beyond the translated intuitionistic language that fit with some earlier-mentioned epistemic statements such as Moore sentences that may become false after updating with new information.

Another proof-oriented interpretation of the modal language occurs in provability logic (Boolos 1993, Artemov 2006). Here the box modality \Box \phi gets interpreted as existence of a proof in some formal system of arithmetic. Note that this interpretation contains an existential, rather than a universal quantifier, as noted in our introduction. This view validates the laws of the minimal modal logic K, as well as the K4 transitivity axiom, that can now be read as saying that given proofs can be proof-checked for correctness. But this interpretation also validates Löb’s Axiom

    \[\Box(\Box \phi \rightarrow\phi) \rightarrow \Box \phi\]

This expresses a deep fact about arithmetical provability – and in fact, provability logic and its many extensions are decidable modal core theories of high-level features of mathematical provability in theories that have the coding power to discuss their own metatheory.

Temporal and spatial logic Still close to mathematics, another lively application area of modal logic concerns physical rather than human nature. A concrete interpretation of models is as flows of time, with accessibility as the temporal order ‘earlier than’ between points. The universal modality then says “everywhere in the future”, with a natural dual “everywhere in the past”. Temporal logics occur in linguistics and philosophy of language (Prior 1967), philosophy of science and philosophy of action (Belnap et al. 2001), but they have also reached computer science and AI, where they show a great diversity beyond the modal point of departure (see Abramsky, Gabbay & Maibaum eds. 1992, Gabbay, Hogger & Robinson eds. 1995). In particular, they can live over different primitive entities: durationless points, or extended periods (van Benthem 1983). The vocabulary of temporal logics is richer than that of the basic modal language. Typical cases are operators saying what goes on during a successful transition: UNTIL \phi \psi says that at some point later than now \phi holds, while at all intermediate points \psi holds.

In this same physical arena, modal logics of space are gaining importance, again in use both in philosophy of science and in knowledge representation in computer science. One of these revives an old idea from the 1930s. Let our modal models be topological spaces endowed with a valuation assigning distinguished subsets to proposition letters (Tarski 1938, Aiello et al., eds. 2007). Then the modality \Box \phi may be read as saying that:

the current point lies in the topological interior of the set [[\phi]] of all points where \phi holds.

In this way, modal laws come to encode topological facts about space. For instance,

\Box(\phi \wedge \psi) \leftrightarrow (\Box \phi \wedge \Box \psi) says that open sets are closed under intersections.

In fact, this interpretation validates all and only the theorems of the modal logic S4. The topological style of analysis extends to modal fragments of geometry. It provides a wide-ranging extension of our standard semantics quantifying over reachable points in graphs, which it contains as a special case. Technically, it suggests a generalized modal semantics in terms of neighborhood models, of a sort developed in the 1960s to explore axiomatic systems below the minimal modal logic K (compare Segerberg 1971, Chellas 1980, Hansen, Kupke & Pacuit 2008) by generalizing the realm of standard relational models.
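
For readers who want to see the interior semantics at work, here is a small Python sketch over an assumed finite topological space (a chain of opens), checking two S4-style laws by direct computation; the space and the valuation are illustrative assumptions.

```python
# A sketch of the topological interior semantics of the box on a small assumed space.

points = {1, 2, 3, 4}
opens = [set(), {1}, {1, 2}, {1, 2, 3}, {1, 2, 3, 4}]    # a chain of opens: a topology
V = {"p": {1, 2, 3}, "q": {2, 3, 4}}

def interior(X):
    """Union of all open sets contained in X."""
    return set().union(*[O for O in opens if O <= X])

def box(X):
    return interior(X)

p, q = V["p"], V["q"]
print(box(p & q) == box(p) & box(q))     # True: interiors distribute over intersection
print(box(p) <= box(box(p)))             # True: the S4 transitivity law under this reading
```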

We do not intend a complete survey of all possible perspectives on modality in this article. One can consult the Handbook of Philosophical Logic for a wide array of uses that have been developed since the 1960s. To conclude here, we just mention one appealing concrete setting where many of the above strands come together naturally (van Benthem 2014).

Agency and games Consider several agents interacting strategically, the natural scenario in much of social life. To see what we are after, consider the simple game depicted in the following tree where players have preferences encoded in pairs:

(value for \mathbf{A}, value for \mathbf{E}).

The standard solution method of ‘Backward Induction’ for extensive games (compare the textbook Osborne & Rubinstein 1994) will analyze this game bottom up, telling player \mathbf{E} to go left at her turn, which then gives player \mathbf{A} a belief that this will happen – and so, based on this belief about his counter-player, \mathbf{A} should turn left at the start. The resulting strategy is indicated by the two bold face lines:

diagram 8

This may be surprising, as the outcome (99, 99) is better for both than reaching (1, 0). So, why should players act this way, and what are plausible alternatives? To answer such questions, a logical approach tries to understand the reasoning underlying Backward Induction. Interestingly, that reasoning is a mix of many modal notions often studied separately. It is about actions, players’ knowledge of the game, their preferences, but also their beliefs about what will happen, their plans, and counterfactual reasoning about situations that will not even be reached with the plan decided on. Thus, well understood, one extremely simple interactive social scenario involves just about the entire agenda of philosophical logic in a coherent manner.

As a case study, the bridge law for the mix of philosophical notions driving Backward Induction is rationality: “players never choose an action whose outcomes they believe to be worse than those of some other available action”. Evidently, this statement is packed with assumptions, and logic wants to clarify these, rather than endorse any unique game-theoretic recommendation. For instance, Stalnaker 1999 analyzes games in terms of additional information about players’ policies for belief revision, another area of modal logic as explained above. Thus, once we understand the standard reasoning, we can also come up with alternatives: logic helps us see the laws, and break them.

Game logics The preceding example suggests that a number of modal logics needs to be put together in some appropriate way. We only give one illustration, but see van Benthem 2014 for more examples. One interesting mix of our earlier epistemics and dynamics occurs in imperfect information games, where players may not know the precise moves played by their opponents. Thus, in these games, the primary epistemic uncertainty is between actions, and only in a derived sense between the resulting game states. Think of a card game where we cannot observe which initial hand Nature is dealing to our opponent, or where some mid-play moves by our opponents may be partially hidden.

Consider the earlier game tree, but now with an uncertainty link for player E at the second stage – she does not know the opening move played by A:

diagram 9

This is a model for a joint language with epistemic modalities K_{i} and dynamic [a] that interact. Halfway, player E knows ‘de dicto’ that she has a winning move:

K_{E}(<\!c\!>\!p\ \ \vee\ <\!d\!>\!p)

but she does not know any particular winning move ‘de re’:

\neg K_{E}<\!c\!>\!p\ \ \& \ \ \neg K_{E}<\!d\!>\!p

This expresses the fact that the game depicted here is ‘non-determined’: E cannot force an outcome p, but neither can A force outcome \neg p for the game.

The general logic of imperfect information games is the minimal dynamic logic plus epistemic ‘multi-S5’. But on top of that, the combined dynamic-epistemic language can also express modes of playing games. Take the basic game-theoretic notion of ‘Perfect Recall’. This describes players whose own actions never introduce uncertainties they did not have before. Properly understood this validates a modal interchange axiom:

(turn_{E} \ \ \& \ \ K_{E}[a]\phi) \rightarrow [a]K_{E}\phi

saying that what we know about the result of our own game moves is still known to us after we perform them. (To understand this, contrast the different effect of actions that are not epistemically neutral, such as drinking.) Thus, special modal axioms in this epistemic-dynamic language correspond with special styles of playing a game.

Of course, there are many other modal aspects to the above story. Games are not just driven by actions and information, but crucially also by players’ goals, depending on their preferences between outcomes. Thus game logics link with modal logics of preference (Von Wright 1963, Hansson 2001, Liu 2011), and with deontic logics of agents’ obligations, rights and duties (Hilpinen 1970, 1981, or the proceedings of the DEON conferences, http://www.deonticlogic.org/). Each of these represents an area of its own with ramifications in philosophy and computer science, witness the following two references: Gabbay & Guenthner, eds., 1981, and Shoham & Leyton Brown 2008.

And so on Modal logic keeps finding new interpretations, and no attempt can be made here to list all its current manifestations, or, in some cases, independent rediscoveries. For instance, we omitted description logics for knowledge representation (Baader et al. eds. 2003), modal logics for webpage languages (ten Cate & Marx 2009), argumentation systems (Grossi 2010), epistemology (Holliday 2012), (Hawke 2015), and so on. This process is likely to go on, since the earlier-mentioned expressiveness/complexity balance of modal languages is a natural zoom level on many topics under the sun.

5. Modern Themes across the Field

We have sketched a few basic features of the classical theory of deduction and definability in modal logic, added a few further themes such as invariance and complexity, and then presented a wide array of current applications or manifestations of modal logic. Of course, there are no simple divisions between pure and applied in logic (or anywhere): applications themselves generate theoretical issues, and in this section, we outline a few themes from the 1990s onward that play across many different application areas.

Extended modal languages and hybrid logics The basic modal language is just a starting point for the analysis of modal notions, though it has acquired a sacred status over time, making extensions seem like foul play to some. Modal languages can be naturally enriched over their original models, and this has happened often, starting with the work of Prior on temporal logic. A well-known extension of this sort adds a universal modality U\phi saying that \phi is true at all worlds, accessible or not. This may look like adding all of first-order logic, but this is by no means the case: the universal modality stays inside the decidable two-variable fragment of first-order logic, at a modest price in computational complexity. The \Box, U language has a matching invariance as before, now with ‘total bisimulations’ whose domains and ranges are the whole models being compared.

The more general move here is toward hybrid logics (Goranko & Passy 1992, Blackburn & Seligman 1995, Areces & ten Cate 2006) that add more expressive power to the basic modal language. One powerful hybrid device is that of ‘nominals’: names for unique worlds that formalize many natural styles of reasoning. This also plugs some blatant expressive gaps in the basic modal language. For instance, much has been made of the latter’s inability to express the natural frame property of irreflexivity

    \[\forall x \neg Rxx\]

But this property is expressed quite simply by the hybrid axiom

    \[i \rightarrow \neg\Diamond i\]

using a nominal i. Nowadays, the tendency is to add such devices freely, seeking a good balance between increased expressive power and manageable complexity. Another example is the earlier temporal operator Until, which again allows for bisimulation analysis, while keeping the resulting logic decidable. An extensive study of general hybrid logics is found in (ten Cate 2005).

While the preceding moves add ‘logical’ expressive power inside first-order logic (or beyond, as we shall see), ‘geometric extensions’ enrich the similarity type of models, adding modalities with new accessibilities. An important case is that of polyadic languages with n-ary accessibility relations. For instance, an existential dyadic modality \Diamond \phi \psi holds at s iff there are t, u with R^{3}stu such that \phi holds at t, and \psi holds at u. Concrete interpretations for ternary relations R abound: ‘s is the concatenation of expressions t, u’, ‘s is the merge of the resources or information pieces t, u’, or ‘s is the geometrical sum of the vectors t, u’.

A limit to which many extensions of both types, logical and geometric, tend is the Guarded Fragment of first-order logic (Andréka, van Benthem & Németi 1998). This is defined inside full first-order syntax by allowing only quantifiers of a guarded form:

    \[\exists\mathbf{y} (G(\mathbf{x}, \mathbf{y}) \wedge \phi (\mathbf{x}, \mathbf{y}))\]

where \mathbf{x}, \mathbf{y} are tuples of variables, G(\mathbf{x}, \mathbf{y}) is an atomic formula containing all the variables of \mathbf{x}, \mathbf{y} (in any order and multiplicity), and \phi is a guarded formula having only variables from \mathbf{x}, \mathbf{y} free. Many modalities are guarded in this syntactic sense, witness translations such as

    \[\Diamond p = \exists y(Rxy \wedge Py)\]

    \[\Diamond pq = \exists yz(Rxyz \wedge Py \wedge Qz)\]

This quite expressive sublanguage of first-order logic where groups of objects are only introduced ‘under guards’ still yields to modal analysis supporting a good meta-theory. The Guarded Fragment has a characteristic bisimulation, and it is decidable, be it now in doubly exponential time. These properties even transfer to extensions that can deal with temporal languages.

Here is what is going on now. The usual landscape of modal logics is one-dimensional: it keeps the basic language constant in expressive power and varies deductive strength of special theories expressed in it. But now we have a second dimension of variation in expressive power. This new landscape is still being charted.

Recursion, induction and fixed-point logics Another typical modern feature absent from classical modal logic is that of recursive definitions, whose meaning involves a process of infinite unwinding in order to reach an equilibrium. In many modal systems today, recursive definitions play a role, say, for iteration of actions, common knowledge, or the description of temporal behavior on infinite histories. In principle, adding inductive definitions and recursion to classical logics leads to systems of high complexity that can encode True Arithmetic, a case in point being first-order logic with inductive definitions LFP(FO) that is widely used in finite model theory (Ebbinghaus & Flum 1995, Libkin 2012). However, modal logics are often robustly decidable, carrying such loads without exploding in complexity. Propositional dynamic logic itself was a case in point, being a small decidable core theory of terminating recursions. New abstract theories of induction and recursion are thriving, such as the following one (Pratt 1981):

The modal \mu–calculus extends the basic modal language with operators \mu p\phi(p) for ‘smallest fixed-points’ where formulas \phi(p) have the following special syntactic format. The propositional variable p occurs only positively, that is, each occurrence of p in \phi lies in the scope of an even number of negations. The semantics for this modal language is more sophisticated than what we have seen before. In particular, the special positive syntax pattern ensures that the following ‘approximation function’ for the predicate defined implicitly by the formula \phi(p)

    \[F\mathbf{^{M}}_{\phi} (X) = \{ s\in\mathbf{M} \mid \mathbf{M}[p:= X], s \vDash \phi\}\]

is monotone in the inclusion order:

whenever X \subseteq Y, then F\mathbf{^{M}}_{\phi} (X) \subseteq F\mathbf{^{M}}_{\phi} (Y).

On so-called ‘complete lattices’ – a special case that often suffices being the power sets of standard modal models – the Tarski-Knaster Theorem then says that monotone maps F always have a smallest fixed-point, an inclusion-smallest set of states X where F(X) = X. Concretely, one can always reach this smallest fixed-point F_{*} through a sequence of approximations indexed by ordinals until there is no more increase:

    \[\varnothing, F(\varnothing), F^{2}(\varnothing), \ldots , F^{\alpha}(\varnothing), \ldots , F_{*}\]

Now, the formula \mu p\phi(p) is said to hold in a model \mathbf{M} at just those states that belong to the smallest fixed-point for the map F\mathbf{^{M}}_{\phi}. Completely dually, there are also greatest fixed-points for monotone maps, and these are denoted by formulas:

\nu p\phi(p), with p occurring only positively in \phi(p).

Greatest fixed-points are definable from smallest ones, via the valid formula:

\nu p\phi(p) \leftrightarrow \neg \mu p\neg \phi(\neg p), where \neg \phi(\neg p) has its occurrences of p positive.

The modal \mu–calculus is the decidable modal core theory of induction and recursion. Incidentally, a further example of such robust decidability is the Guarded Fragment: its fixed-point extension LFP(GF), extending the modal \mu–calculus, is still decidable.
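
The approximation procedure from above can be run directly on a finite model. The following Python sketch, over an assumed graph and valuation, computes the smallest fixed-point of the map associated with \mu p. (q \vee \Diamond p), which defines exactly the states from which a q-state is reachable.

```python
# A sketch of evaluating a smallest fixed-point formula by iteration from the empty set;
# the graph and valuation are illustrative assumptions.

W = {1, 2, 3, 4, 5}
R = {(1, 2), (2, 3), (4, 4), (5, 1)}
V = {"q": {3}}

def dia(X):
    return {s for s in W if any(t in X for (u, t) in R if u == s)}

def smallest_fixed_point(F):
    """Iterate a monotone map F from the empty set until it stabilizes."""
    X = set()
    while True:
        Y = F(X)
        if Y == X:
            return X
        X = Y

mu = smallest_fixed_point(lambda X: V["q"] | dia(X))   # mu p. (q \/ <>p)
print(mu)    # {1, 2, 3, 5}: the states that can reach the q-state (state 4 only loops on itself)
```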

There is a fast-growing literature on the \mu–calculus (compare Blackburn, de Rijke & Venema 2001). Venema 2007 is an up-to-date study in connection with current logics for computation, where many themes that we have mentioned for the basic modal logic return in more sophisticated forms, appropriate to infinite processes.

One more general background here is the study of ‘co-inductive’ infinite processes that are not built bottom-up, but can only be observed top-down; this study has become a thriving area of its own in the foundations of computation and games under the name of co-algebra. Modal fixed-point logics point the way toward much more abstract new modal logics that match the category-theoretic semantics of co-inductive computation (Kurz 2001).

What is striking in these developments is the merge of modal logic and automata theory and also game theory. Automata as perspicuous representations of modal formulas are affecting our very understanding of modal languages, and the resulting theory, of great power and elegance, may come to impact our understanding of the field as a whole.

System combination Another major theme in modal logic today is system combination. While single modal logics may be simple, many applications require combining several such logics, as we saw with knowledge, action, and preference in games. Here, crucially, the architecture of combined systems matters. Adding simple systems together need not result in simple systems at all. It depends very much on the mode of combination. There are several ways of combining modal logics, ranging from mere ‘juxtaposition’ to more intricate forms of interaction between the component logics. There is an incipient theory of relevant modes of combination, including new constructions of ‘product’ and ‘fibering’ (Gabbay 1996). Here we only mention one important phenomenon.

Complexity can increase rapidly when combined modal logics include what look like natural and attractive ‘commutation properties’.

Fact The minimal modal logic of two modalities [1], [2] plus the universal modality U satisfying the axiom [1][2]\phi \rightarrow [2][1]\phi is undecidable.

The reason is that such logics encode complex ‘tiling problems’ on the cross-product of the natural numbers (Harel 1985, Marx 2006). By methods of frame correspondence, the commutation axiom defines a grid structure satisfying a first-order convergence property:

    \[\forall xyz: (xR_{1}y \wedge yR_{2}z) \rightarrow \exists u: (xR_{2}u \wedge uR_{1}z)\]

Here is a diagram picturing this, creating a cell of a geometric grid:

diagram 10

This complexity danger is general, and the following two mnemonic pictures may help the reader. Modal logics of trees are harmless, modal logics of grids are dangerous!

diagram 11

Many dangerous combinations of modal systems arise when epistemic and temporal logics are merged, and the first pioneering results were in fact proved in this area in Halpern & Vardi 1989 (compare the survey in van Benthem & Pacuit 2006).

The general topic behind system combination, and one that seems to have attracted little attention in philosophical logic so far, is the architecture of logical systems.

Modal predicate logic An important topic in philosophical applications of modal logic that we have mostly ignored in this survey is modal predicate logic. While this is faithful to the field as a whole (technically, modal predicate logic is just one of many system combinations), it is a serious omission for many purposes, and we will only partly make up for it by mentioning some current trends and supporting literature.

Many philosophical issues have to do with the nature of objects and their identification across different modal situations, as explained at length in James Garson’s chapter on modal predicate logic in the Handbook of Philosophical Logic. Modal predicate logic has been important as a hotbed of discussion, both philosophical and technical. The main semantics seems obvious, annotating the possible worlds in an accessibility graph with domains of objects and predicates, familiar from models for first-order logic. But a major challenge has been how to interpret assertions:

    \[\mathbf{M}, s \vDash \Box \phi [\mathbf{d}]\]

representing a predication about objects \mathbf{d} assigned to the free variables in \phi from the domain of s. One semantics looks at accessible worlds t with Rst where those self-same objects occur (Kripke 1980, Hughes & Cresswell 1969), but one can also merely allow ‘counterparts’ to the \mathbf{d} in t (Lewis 1968), an idea that has returned in sophisticated mathematical semantics for modal predicate logic where objects across worlds can only be related to each other through available functions. We will not provide further details, but refer the reader to (Rabinowicz & Segerberg 1994, Gupta & Thomason 1980, Belnap et al. 2001, Williamson 2000, 2013, Holliday & Perry 2013) for sophisticated modal predicate logics, showing how the interplay of modality, objects and predication forms a natural continuation of the modal themes in this article.

Modern modal predicate logic is a sophisticated area (Gabbay, Shehtman & Skvortsov, to appear). While many techniques for modal propositional logic extend to this area, the devil is in the details, and no consensus has emerged yet on a philosophically or a mathematically optimal framework for the whole field. In fact, some people feel that the underlying mathematical subtleties have to do with modal predicate logic being a ‘product logic’ of two systems (Gabbay, Kurucz, Wolter & Zakharyashev 2007) that are themselves modal in character – modal propositional logic, and predicate logic itself – and it is not yet clear what the most natural system combination is here.

Other mathematical approaches While this survey largely follows standard relational models for modal logic, it is important to realize that there are several other approaches in the area that have an even broader potential for theory and practice. We elaborate briefly on a few hints in this direction given earlier in this article.

One powerful paradigm is algebraic approaches, viewing modal logic as a study of classical algebras enriched with further operators, making the subject a branch of algebraic logic (Venema 2006). Our relational models are then connected to algebras through representation theorems, a tradition started by Stone and Birkhoff in Universal Algebra, and taken to modal logic in Jónsson & Tarski 1951. In particular, viewed algebraically, modal operators can then live on quite different base logics: intuitionistic, or even much weaker ones (Andreka, Németi & Sain 2003), (Palmigiano et al. 2014).

Another important strand of models, mentioned earlier in connection with topology, consists of neighborhood models with a built-in world-to-set relation N s X and a crucial truth clause:

\mathbf{M}, s \vDash \Box \phi iff there is a set X with N s X and \mathbf{M}, t \vDash \phi for all t in X

Neighborhood semantics date back to the 1960s (Segerberg 1971, Chellas 1980), but since then, they have found many new uses in co-algebraic computation (Hansen, Kupke & Pacuit 2008), refined notions of ‘powers’ for players in games, single or in coalitions (Pauly 2001), or ‘evidence’ in inquiry, where different neighborhood sets record ‘reasons’ or observations made in the history so far (van Benthem & Pacuit 2011).
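
A small Python sketch may help fix ideas: in a neighborhood model the box quantifies over an assumed world-to-set relation rather than over successors, and the familiar distribution of the box over conjunction is no longer automatic. The model below is an illustrative assumption.

```python
# A sketch of the neighborhood truth clause for the box; the model is an assumption.

W = {1, 2, 3}
V = {"p": {1, 2}, "q": {2, 3}}
N = {1: [{1, 2}, {2, 3}],     # the neighborhoods assigned to each world
     2: [{2, 3}],
     3: []}                   # no neighborhoods: no box-formula holds at world 3

def box(prop):
    """Worlds with some neighborhood entirely contained in prop."""
    return {s for s in W if any(X <= prop for X in N[s])}

print(box(V["p"]) & box(V["q"]))      # {1}: Box p and Box q both hold at world 1 ...
print(box(V["p"] & V["q"]))           # set(): ... yet Box(p & q) fails there
```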

In particular, neighborhood models are also a general form of what are called ‘hyper-graphs’ in mathematics, and as such, they have also been proposed in the recent philosophical literature as a way of modeling so-called hyper-intensional notions where standard logical equivalence is replaced by finer sieves for defining propositions.

One important feature shared by these and other generalized semantics for modal logic is a change in appropriate base logics and base languages. What may be an appropriate logical language over some initially studied model class may fail to have enough power to make distinctions over a generalized model class. Modern logic is replete with examples of this phenomenon (Girard 1987, Restall 2000), and modal logic is no exception. We will encounter a concrete illustration in Section 7 below.

What is modal logic? The wealth of theory and applications in modal logic today may seem overwhelming: the 2006 Handbook of Modal Logic runs to some 1200 pages. The question arises: What is truly ‘modal logic’? The themes in this survey give a working answer as an agenda of themes plus a modus operandi, but there are also more mathematical angles. One general abstract approach is in terms of Lindström theorems (van Benthem, ten Cate & Väänänen 2009). The basic modal logic can be shown to be maximal with respect to possessing two major properties from our earlier analysis and from first-order logic in general: invariance for bisimulation, and the compactness theorem. Further results in this vein can help us understand what makes landmark modal systems tick. However, no such results are known yet for the modal fixed-point logics that are so prominent today, and model-theoretic analysis may have to merge with notions from automata theory.

6. Modal Logic and Philosophy Today

With a technical survey like this, the reader may have the impression that modal logic is one of those subjects that started in philosophy, but then went their own way to become independent disciplines. But leaving the nest for good is a rigid biological view of intellectual history. Prodigal sons leave, but also return. Technical modal logic still serves as a laboratory for new notions of interest to philosophers in modal predicate logic (Williamson 2013), and further examples abound: compare (Stalnaker 2006). Moreover, as we saw with strategic reasoning in games, the unity of modal patterns in new application areas provides a new unity all across philosophical logic. Even so, some manifestations of modal logic today seem fossilized remnants, where ‘being philosophical’ means no more than using systems with forbidding names like S4.3 or KD45 whose origins, long ago, had to do with philosophical motivations. But things can be much more lively than this.

A good case for optimism is the interface of modal logic and epistemology. This started in the 1960s with Hintikka’s pioneering work, carried on by Lewis, Stalnaker, and others. Ever since then, the perceived inadequacies of our simple notion of knowledge have dominated discussions of issues such as logical omniscience and introspection. What happened afterwards was a parting of the ways. Modal logicians found ever more uses of epistemic logics, whether or not their main modality captured the philosophical notion of knowledge. At the same time, philosophers developed interesting new accounts of knowledge undreamt of in the logical tradition. The ‘relevant alternatives theory’ of Dretske 1970, and later de Rose, Lewis, Lawlor, comes with a more dynamic account of choosing relevant spaces of alternative worlds that are essential to knowledge claims. This deeply changes the behavior of basic epistemic reasoning, making for large differences with classical epistemic logic. In an alternative line, Nozick 1981 and, later on, Sosa and Roush have introduced the ‘tracking theory’ of knowledge as true belief that correctly tracks the truth over time, and also counterfactually, in worlds slightly different from the present one. And yet one more rich line is the ‘stability theory’ of knowledge as belief that survives new information or criticism, developed by Lehrer, Stalnaker, Rott, and others. Until the beginning of the twenty-first century, discussions in the philosophical and logical milieus seemed largely disjoint. However, the two streams of thought are approaching each other. Maintaining relevant alternatives shows clear similarities with the information dynamics discussed earlier. Tracking and stability accounts of knowledge intertwine knowledge, truth, belief, and counterfactuals in ways that are intriguing also to logicians. A current wave(let) of publications is bringing the two traditions together (Holliday 2012, Baltag & Smets 2008, Holliday & Perry 2014, Hawke 2015), opening new interfaces for modal logic far beyond the usual laments about the inadequacies of Hintikka’s original system.

And this is just one instance. Contacts between modal logic and philosophy in new modes are very much in evidence in the literature on metaphysics (Zalta 1993, Williamson 2000, 2013, Fine 2002) and on epistemic modals (expressions like “must”, “may”, “probably”, and so on), where modal logic meets epistemology and philosophy of language (Swanson 2011, Yalcin 2007, Holliday & Icard 2013, Hawke & Steinert-Threlkeld 2015). The same is true for social epistemology, with its notions of group knowledge and information dynamics (Helzner and Hendricks 2013, Baltag & Smets 2012, List & Pettit 2002, Christof and Hansen 2015), and for the epistemic foundations of game theory (Aumann 1976, Stalnaker 1999). If anything, contacts between modal logic and philosophy are livelier than ever before, though, to see this, one has to look broadly and not seek a monopoly for one favored philosophical interface.

7. Coda: Modal Logic as a Part of Standard Logic

In this article, pains were taken to emphasize that modal logic in the early twenty-first century is not a sort of intensional epicycle or ornamentation of standard logical systems, but a tool inside the classical realm for analyzing the fine-structure of the rich landscape of systems that span the field of logic today. We have also emphasized that there is no case for opposition or replacement here: instead, we advocated a ‘tandem view’ that keeps both modal and classical perspectives at our disposal when studying some area of reasoning. A certain flexibility in bringing these to bear, though it may look opportunistic to some, is in fact a hallmark of creativity in a working logician.

But as always in logic, one can keep looking at any topic in different ways. Consider the contrast between ‘poorer’ modal and ‘richer’ classical formalisms. Many people see the business of logic as zooming in on some reasoning practice, supplying more and more details until total clarity and cogency are achieved. This is how one thinks of complete formalizations in the foundations of mathematics that can be checked by machines. Adding layers of detail and precision is one important use of logic, but there is also an inverse one, which consists rather in zooming out. In the details of some reasoning practice, there may be higher-level patterns that form a simple system of their own and that can be brought into the open. Modal logics often have this zooming-out character, isolating some simple but very basic patterns of reasoning inside a richer practice: say, the way in which modal logics of space find a decidable core theory inside all the reasoning that goes on in a topology textbook. These dual skills of zooming in and zooming out seem equally important to logic, and modal logic is a powerful tool for the latter.

And here is one more dual view on what a modal analysis achieves. In this article, we have stressed how modal languages translate into fragments of classical languages. But as we shall see in a moment, a simple modal semantics for these fragments often suggests a generalized semantics for the complete language, yielding intriguing trade-offs between viewing modal laws as standard validities for some small part of classical first-order logic, or as the complete set of validities for a generalized view of what the full first-order language is about. While this may sound rather technical, the actual contemporary subtlety found in studies of logical systems is the best fuel for a practice-based philosophy of logic.

In addition to these general perspectives, modal logic and classical logic also interact in the form of unusual mixes. We end with two examples that may surprise the reader.

Modal foundations of predicate logic. Predicate logic itself is a form of modal or dynamic logic. The key truth condition for the standard existential quantifier reads:

\mathbf{M}, s \vDash \exists x\phi iff there exists an object d in D^{M} with \mathbf{M}, s[x:=d] \vDash \phi

This clearly has a modal pattern for evaluating an existential modality:

\mathbf{M}, s \vDash \exists x\phi iff there exists t with R^{x}st and \mathbf{M}, t \vDash \phi

where we now think of the points s as states of some semantic evaluation process.
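
To see this parallel at work, here is a minimal Python sketch (a hypothetical illustration, not taken from the article) over a toy model with domain {1, 2, 3} and a “less than” predicate: it evaluates \exists x\phi once by the Tarskian clause and once as a modal diamond over the relation R^{x} of “agreeing on all variables except x”, and the two clauses coincide on full assignment models.

    # Minimal sketch (hypothetical example): states are variable assignments,
    # and R^x relates two assignments that agree on every variable except x.
    from itertools import product

    DOMAIN = {1, 2, 3}   # toy first-order domain

    # Standard Tarskian clause: M, s |= Ex phi  iff  some d makes phi true at s[x:=d].
    def exists_tarski(x, phi, s):
        return any(phi({**s, x: d}) for d in DOMAIN)

    # Modal clause: M, s |= <x> phi  iff  phi holds at some t with R^x(s, t),
    # where R^x(s, t) says that s and t agree on all variables other than x.
    def R_x(x, s, t):
        return all(s[v] == t[v] for v in s if v != x)

    def all_states(variables):
        # every assignment of domain elements to the given variables
        return [dict(zip(variables, values))
                for values in product(DOMAIN, repeat=len(variables))]

    def exists_modal(x, phi, s):
        return any(phi(t) for t in all_states(list(s)) if R_x(x, s, t))

    # phi says "the value of x is less than the value of y".
    phi = lambda s: s["x"] < s["y"]
    s0 = {"x": 3, "y": 2}

    print(exists_tarski("x", phi, s0))   # True: take x = 1
    print(exists_modal("x", phi, s0))    # True: the same witness, now reached as an R^x-successor

On generalized models, where not every assignment need be an available state, the two readings can come apart; that freedom is exactly what the layered analysis below exploits.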

Viewed in this light, the usual laws of first-order logic are deconstructed into several layers. The ‘decidable core’ is the minimal modal logic, containing practically important ubiquitous laws such as Monotonicity:

    \[\forall x(\phi \rightarrow \psi) \rightarrow (\forall x\phi \rightarrow \forall x\psi)\]

This level makes no presuppositions whatsoever concerning the form of the models: they could have any kind of ‘states’ and ‘variable shift relations’ R^{x}. Next, there are laws recording effects of taking states to be concrete variable assignments, connected by a special shift relation of ‘agreeing up to the value for x’. For instance,

    \[\forall x\phi \rightarrow \forall x\forall x\phi\]

expresses the transitivity of R^{x}: indeed, all of S5 holds. Finally, more specifically than these first two layers, some first-order laws express existence properties that demand richness of the universe of available states. As an example, the innocent-looking law:

    \[\exists x\forall y \phi \rightarrow \forall y \exists x \phi\]

expresses confluence: if s R^{x} t and s R^{y} u, there also exists a state v with t R^{y} v and u R^{x} v. When pictured, this is a grid property as discussed before with combinations of modal logics, and indeed, it is at this third level that the undecidability of first-order logic arises. Thus, modal analysis reveals unexpected ‘fine-structure’ in the class of what is usually lumped together as ‘standard validities’: they are valid for different reasons.
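
To make the modal guise of these three layers explicit, one may write [x] for \forall x and \langle x\rangle for \exists x (a notational gloss added here, not used in the article itself); the three sample laws then read:

    \[ [x](\phi \rightarrow \psi) \rightarrow ([x]\phi \rightarrow [x]\psi) \quad \text{(minimal modal logic, no frame conditions)}\]

    \[ [x]\phi \rightarrow [x][x]\phi \quad \text{(transitivity of } R^{x}\text{)}\]

    \[ \langle x\rangle [y]\phi \rightarrow [y]\langle x\rangle \phi \quad \text{(confluence of } R^{x} \text{ and } R^{y}\text{)}\]

The last formula is a grid axiom of the kind mentioned above, and it is there that the jump in complexity occurs.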

We also see another earlier phenomenon exemplified: generalized semantics supports richer languages. On our general modal models, the first-order language gets increased expressive power, since new distinctions come up. In particular, polyadic quantifiers \exists xy\,\phi introducing two objects simultaneously now become different from two-step iterations \exists x\exists y\,\phi or \exists y\exists x\,\phi. Summing up, in a modal perspective, we get an unorthodox view that shifts the border line of basic logic. The (modal) core of standard first-order logic is decidable, just as Leibniz already thought – but piling up special (existential) conditions makes state sets behave so much like full function spaces D^{VAR} that their logic becomes undecidable, since it now encodes the mathematics of such spaces. For much more on these modal foundations of predicate logic, see (van Benthem 1996).

Dynamic predicate logic. Another new view on first-order logic emphasizes the intuitive state change implicit in evaluating an existential quantifier. The ‘dynamic semantics’ of Groenendijk & Stokhof (1991) makes this explicit: successful evaluation is a move to a new state containing a suitable witness value for x that makes the formula true. More generally, one can then let first-order formulas denote actions of evaluation: (a) atomic formulas are ‘tests’ of whether the current state satisfies the relevant fact, (b) an existential quantifier picks an object and assigns it to x (‘random assignment’), (c) a substitution operator [t/x] is a ‘definite assignment’ x:=t, (d) a conjunction is sequential action ‘composition’, and (e) a negation \neg \phi is a test for the ‘impossibility’ of successfully executing the action \phi.

The resulting ‘dynamified’ first-order logic has applications in the semantics of natural language, since pronouns “he”, “she”, “it” show this kind of dynamic behavior. One nice illustration occurs with sentences like:

    \[\exists x Kx \rightarrow Hx\]

(“if you get a kick, it hurts”). Standard folklore ‘improves’ natural language here to a first-order form:

    \[\forall x (Kx \rightarrow Hx)\]

But with dynamic semantics, this meaning arises automatically for the above surface form, as any value assigned by the existential move in the antecedent will be bound to x when the consequent is processed. The system has also inspired programming languages for dynamic execution of specifications. ‘Dynamic predicate logic’ is a general paradigm for bringing out the cognitive dynamics that underlies existing logical systems. This allows one to view natural language meanings in terms of updates of propositional content, perspective, and other parameters that determine the transfer of information.
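
The following minimal Python sketch (a hypothetical illustration, not part of the article) implements just enough of this dynamic semantics, namely tests, random assignment, composition, and a dynamic implication, to check the kick/hurt example over a toy model: the value introduced by the existential in the antecedent remains available for x in the consequent, so the surface form behaves like its universally quantified ‘improvement’.

    # Minimal sketch (hypothetical example) of dynamic predicate logic:
    # a formula maps an input assignment to the list of its output assignments.
    DOMAIN = {"a", "b", "c"}
    KICKED = {"a", "b"}          # extension of K: things that get a kick
    HURTS = {"a", "b"}           # extension of H: things that hurt

    def test(pred):
        # atomic formula: succeeds (passes the input state on) iff the fact holds
        def run(s):
            return [s] if pred(s) else []
        return run

    def exists(x):
        # random assignment: pick any object of the domain for x
        def run(s):
            return [{**s, x: d} for d in DOMAIN]
        return run

    def conj(phi, psi):
        # sequential composition of the two evaluation actions
        def run(s):
            return [u for t in phi(s) for u in psi(t)]
        return run

    def implies(phi, psi):
        # dynamic implication: a test that every output of phi allows
        # a successful continuation with psi
        def run(s):
            return [s] if all(psi(t) for t in phi(s)) else []
        return run

    # "Ex Kx -> Hx": the x of the antecedent stays bound in the consequent.
    kick_hurts = implies(conj(exists("x"), test(lambda s: s["x"] in KICKED)),
                         test(lambda s: s["x"] in HURTS))

    print(bool(kick_hurts({})))   # True: every kicked thing hurts

    HURTS.discard("b")            # make one kicked thing painless ...
    print(bool(kick_hurts({})))   # ... False, just as Ax(Kx -> Hx) would be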

The reader should have no difficulty seeing that there is again an underlying modal logic, this time related to the dynamic logic of programs discussed earlier in this article (van Eijck & de Vries 1992, Muskens, van Benthem & Visser 1997).

8. Conclusion

We have discussed modal logic as lying at a crossroads of many disciplines, though we have tried to maintain the original philosophical connections, and also pointed at some promising trends reviving that particular interface. The resulting presentation is different in spirit from other surveys in current anthologies, handbooks, and encyclopedias. We presented modal logic as a tool for fine-structure analysis of expressiveness and complexity of logical systems, including the sometimes surprising effects of their combinations, and we emphasized the major application areas (information, computation, action, agency) that drive abstract theory today. As a result, we had no uniform conclusion, or definition of modal logic to offer in the end: the field seems too rich for that. Our purpose with this panorama will have been served if the reader experiences a beneficial culture shock.

9. References and Further Reading

  • S. Abramsky, D. Gabbay & T. Maibaum, eds., 1992, Handbook of Logic in Computer Science, Oxford University Press, Oxford.
  • M. Aiello, I. Pratt & J. van Benthem, eds., 2007, Handbook of Spatial Logics, Springer Science Publishers, Heidelberg.
  • H. Andréka, I. Nemeti & J. van Benthem, 1998, ‘Modal Languages and Bounded Fragments of Predicate Logic’, Journal of Philosophical Logic 27, 217–274.
  • H. Andréka, I. Németi & I. Sain, 2003, ‘Algebraic Logic’, in Handbook of Philosophical Logic.
  • C. Areces & B. ten Cate, 2006, ‘Hybrid Logics’, In P. Blackburn et al. eds., Handbook of Modal Logic, Elsevier, Amsterdam.
  • S. Artemov, 2006, ‘Modal Logic and Mathematics’, in P. Blackburn et al., eds. Handbook of Modal Logic, 927–970.
  • R. Aumann, 1976, ‘Agreeing to Disagree’, The Annals of Statistics 4:6, 1236–1239.
  • F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, & P. F. Patel-Schneider, eds., 2003, The Description Logic Handbook: Theory, Implementation, Applications. Cambridge University Press, Cambridge.
  • A. Baltag, L. Moss & S. Solecki, 1998, ‘The Logic of Public Announcements, Common Knowledge and Private Suspicions’, Proceedings TARK 1998, 43–56, Morgan Kaufmann Publishers, Los Altos.
  • A. Baltag & S. Smets, 2008, ‘A Qualitative Theory of Dynamic Interactive Belief Revision’, in G. Bonanno, W. van der Hoek, M. Wooldridge, eds., Texts in Logic and Games Vol. 3, Amsterdam University Press, 9–58.
  • A. Baltag & S. Smets, 2012, Interactive Learning, Formal Social Epistemology and Group Belief Dynamics: Logical, Probabilistic and Game-theoretic Models, Lecture Notes, ESSLLI Summer School, Opole.
  • J. Barwise & J. van Benthem, 1999, ‘Interpolation, Preservation & Pebble Games’, Journal of Symbolic Logic 64, 881–903.
  • P. Battigalli & G. Bonanno, 1999, ‘Recent Results on Belief, Knowledge and the Epistemic Foundations of Game Theory’, Research in Economics 53, 149–225.
  • N. Belnap, M. Perloff & M. Xu, 2001, Facing the Future, Oxford Univ. Press, Oxford.
  • J. van Benthem, 1983, The Logic of Time, Kluwer, Dordrecht.
  • J. van Benthem, 1984, ‘Correspondence Theory’, in D. Gabbay & F. Guenthner, eds., Volume III, 167–247.
  • J. van Benthem, 1996, Exploring Logical Dynamics, CSLI Publications, Stanford.
  • J. van Benthem, 2010, Modal Logic for Open Minds, CSLI Publications, Stanford.
  • J. van Benthem, 2011, Logical Dynamics of Information and Interaction, Cambridge University Press, Cambridge.
  • J. van Benthem, 2014, Logic in Games, The MIT Press, Cambridge (Mass.).
  • J. van Benthem & P. Blackburn, 2006, ‘Modal Logic, A Semantic Perspective’, in P. Blackburn et al., eds., 2006, 1–84.
  • J. van Benthem, B. ten Cate & J. Väänänen, 2009, ‘Lindström Theorems for Fragments of First-Order Logic’, Logical Methods in Computer Science 5:3, 1–27.
  • J. van Benthem & E. Pacuit, 2006, ‘The Tree of Knowledge in Action’, Proceedings Advances in Modal Logic, ANU Melbourne.
  • J. van Benthem & E. Pacuit, 2011, ‘Dynamic Logic of Evidence-Based Beliefs’, Studia Logica 99:1, 61–92.
  • P. Blackburn, J. van Benthem & F. Wolter, eds., 2006, Handbook of Modal Logic, Elsevier Science Publishers, Amsterdam.
  • P. Blackburn & W. Meyer Viol, 1994, ‘Linguistics, Logic, and Finite Trees’, Logic Journal of the IGPL 2, 3–29.
  • P. Blackburn, M. de Rijke & Y. Venema, 2001, Modal Logic, Cambridge University Press, Cambridge.
  • P. Blackburn & J. Seligman, 1995, ‘Hybrid Languages’, Journal of Logic, Language and Information 4, 251-272.
  • G. Boolos, 1993, The Logic of Provability, Cambridge University Press, Cambridge.
  • J. Burgess, 1981, ‘Quick Completeness Proofs for some Logics of Conditionals’, Notre Dame Journal of Formal Logic 22:1, 76–84.
  • S. Buvac & I. Mason, 1994, ‘Propositional Logic of Context’, Proceedings AAAI, 412–419.
  • R. Carnap, 1947, Meaning and Necessity, The University of Chicago Press, Chicago.
  • B. ten Cate, 2005, Model Theory for Extended Modal Languages, Ph.D. Thesis, University of Amsterdam. ILLC Dissertation Series DS-2005-01.
  • B. ten Cate & M. Marx, 2009, ‘Axiomatizing the Logical Core of XPath 2.0’, Theory Comput. Syst. 44(4): 561–589.
  • A. Chagrov & M. Zakharyaschev, 1996, Modal Logic, Clarendon Press, Oxford.
  • B. Chellas, 1980, Modal Logic, An Introduction, Cambridge University Press, Cambridge.
  • Z. Christof & J-U Hansen, 2015, ‘A Logic for Diffusion in Social Networks’, Journal of Applied Logic 13, 48–77.
  • H. van Ditmarsch, W. van der Hoek & B. Kooi, 2007, Dynamic Epistemic Logic, Springer Science Publishers, Heidelberg.
  • F. Dretske, 1970, ‘Epistemic Operators’, The Journal of Philosophy, 67, 1007–1023.
  • H. D. Ebbinghaus & J. Flum, 1995, Finite Model Theory, Springer, Heidelberg.
  • J. van Eijck & F-J de Vries, 1992, ‘Dynamic Interpretation and Hoare Deduction’, Journal of Logic, Language and Information 1, 1–44.
  • R. Fagin, J. Halpern, Y. Moses & M. Vardi, 1995, Reasoning About Knowledge, The MIT Press, Cambridge (Mass.).
  • K. Fine, 2002, The Limits of Abstraction, Oxford University Press, Oxford.
  • G. Frege, 1879, Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens, Louis Nebert, Halle.
  • D. Gabbay, 1996, ‘Fibred Semantics and the Weaving of Logics Part 1: Modal and Intuitionistic Logics’, Journal of Symbolic Logic 61, 1057–1120.
  • D. Gabbay & F. Guenthner, eds., 1981, Handbook of Philosophical Logic, four volumes, Kluwer, Dordrecht. Revised and expanded version appeared from 2001 onward with Springer Science Publishers.
  • D. Gabbay, Ch. Hogger & J. Robinson, eds., 1997, Handbook of Logic in Artificial Intelligence and Logic Programming, Oxford University Press, Oxford.
  • D. Gabbay, A. Kurucz, F. Wolter & M. Zakharyaschev, 2007, Many-Dimensional Modal Logics: Theory and Applications, Elsevier, Amsterdam.
  • D. Gabbay, V. Shehtman, D. Skvortsov, to appear, Quantification in Nonclassical Logic, King’s College London & Moscow University of Humanities.
  • P. Gärdenfors & H. Rott, 1995, ‘Belief Revision’, in D. M. Gabbay, C. J. Hogger & J. A. Robinson, eds., Handbook of Logic in Artificial Intelligence and Logic Programming 4, Oxford University Press, Oxford.
  • N. Gierasimczuk, 2010, Knowing One’s Limits, Logical Analysis of Inductive Inference, Dissertation, Institute for Logic, Language and Computation, University of Amsterdam.
  • J-Y Girard, 1987, ‘Linear Logic’, Theoretical Computer Science 50, 1–102.
  • K. Gödel, 1933, ‘Eine Interpretation des Intuitionistischen Aussagenkalküls’, Ergebnisse eines Mathematischen Kolloquiums 4, 34–38.
  • R. Goldblatt & S. Thomason, 1975, ‘Axiomatic Classes in Propositional Modal Logic’, in J. Crossley, ed., Algebra and Logic, Springer Lecture Notes in Mathematics 450, 163–173.
  • V. Goranko & S. Passy, 1992, ‘Using the Universal Modality: Gains and Questions’, Journal of Logic and Computation 2, 5–30.
  • J. Groenendijk & M. Stokhof, 1991, ‘Dynamic Predicate Logic’, Linguistics and Philosophy 14, 39–100.
  • D. Grossi, 2010, ‘On the Logic of Argumentation Theory’, in W. van der Hoek et al. eds., Proceedings 9th International Conference on Autonomous Agents and Multiagent Systems, Toronto, 409–416.
  • A. Grove, 1988, ‘Two Modellings for Theory Change’, Journal of Philosophical Logic 17, 157–170.
  • A. Gupta & R. Thomason, 1980, ‘A Theory of Conditionals in the Context of Branching Time’, Philosophical Review 89, 65–90.
  • J. Halpern & M. Vardi, 1989, ‘The Complexity of Reasoning about Knowledge and Time, I: lower bounds’. Journal of Computer and System Sciences, 38(1):195–237.
  • H. Hansen, C. Kupke & E. Pacuit, 2008, ‘Neighbourhood Structures: Bisimilarity and Basic Model Theory’, in D. Kozen, U. Montanari, T. Mossakowski & J. Rutten, eds., Logical Methods in Computer Science 15, 1–38.
  • S. O. Hansson, 2001, ‘Preference Logic’, in D. Gabbay & F. Guenthner, eds., Handbook of Philosophical Logic IV, 319–393, Kluwer, Dordrecht.
  • D. Harel, 1985, ‘Recurring Dominoes: Making the Highly Undecidable Highly Understandable’, Annals of Discrete Mathematics 24, 51–72.
  • D. Harel, D. Kozen & J. Tiuryn, 2000, Dynamic Logic, The MIT Press, Cambridge (Mass.).
  • P. Hawke, 2015, Knowledge and Relevance, Ph.D. Thesis, Department of Philosophy, Stanford University.
  • P. Hawke & S. Steinert-Threlkeld, 2015, ‘Informational Dynamics of “Might” Assertions’, Department of Philosophy, Stanford University. Proceedings LORI 2015, Taipei.
  • J. Helzner & V. Hendricks, 2013, Agency and Interaction: What we Are and What we Do in Formal Epistemology, Cambridge University Press, Cambridge.
  • R. Hilpinen, ed., 1970, Deontic Logic: Introductory and Systematic Readings, Reidel, Dordrecht.
  • R. Hilpinen, ed., 1981, New Studies in Deontic Logic, Reidel, Dordrecht.
  • J. Hintikka, 1962, Knowledge and Belief, Cornell University Press, Ithaca.
  • W. Holliday, 2012, Knowing What Follows, Epistemic Closure and Epistemic Logic, Dissertation, Department of Philosophy, Stanford University.
  • W. Holliday & Th. Icard, 2013, ‘Logic, Probability and Epistemic Modality’, Departments of Philosophy, Berkeley and Stanford.
  • W. Holliday & J. Perry, 2013, ‘Roles, Rigidity and Quantification in Epistemic Logic’, Departments of Philosophy, Berkeley and Stanford.
  • G. Hughes & M.J. Cresswell, 1969, An Introduction to Modal Logic, Methuen, London.
  • B. Jónsson & A. Tarski, 1951, ‘Boolean Algebras with Operators’, Parts I and II, American Journal of Mathematics 73, 891–939, and 74, 127–162.
  • S. Kanger, 1957, Provability in Logic. Almqvist & Wiksell, Stockholm.
  • K. Kelly, 1996, The Logic of Reliable Inquiry, Oxford University Press, Oxford.
  • W. & M. Kneale, 1961, The Development of Logic, Oxford University Press, Oxford.
  • S. Kripke, 1959, ‘A Completeness Theorem in Modal Logic’, The Journal of Symbolic Logic, 24, 1–14.
  • S. Kripke, 1963, ‘Semantical Considerations on Modal Logic’, Acta Philosophica Fennica 16, 83–94.
  • S. Kripke, 1965, ‘Semantical Analysis of Intuitionistic Logic’, in J. Crossley and M. A. E. Dummett, eds., Formal Systems and Recursive Functions, North-Holland, Amsterdam, 92–130.
  • S. Kripke, 1980, Naming and Necessity. Harvard University Press, Cambridge (Mass.).
  • A. Kurz, 2001, Coalgebras and Modal Logic, Lecture Notes, CWI Amsterdam.
  • J. van Leeuwen, ed., 1991, Handbook of Theoretical Computer Science, North-Holland, Amsterdam.
  • C. I. Lewis & H. Langford, 1932, Symbolic Logic, Dover, New York.
  • K. Leyton-Brown & Y. Shoham, 2008, Essentials of Game Theory: A Concise Multidisciplinary Introduction, Morgan & Claypool Publishers, San Rafael.
  • D. Lewis, 1969, Convention, Harvard University Press, Cambridge (Mass.).
  • D. Lewis, 1973, Counterfactuals, Blackwell, Oxford.
  • L. Libkin, 2012, Elements of Finite Model Theory, Springer, Berlin.
  • Ch. List & Ph. Pettit, 2002, ‘Aggregating Sets of Judgments: An Impossibility Result’, Economics and Philosophy 18, 89–110.
  • F. Liu, 2011, Reasoning About Preference Dynamics, Springer, Dordrecht.
  • M. Marx, 2006, ‘Complexity of Modal Logic’, in P. Blackburn et al., eds., 139–179.
  • R. Montague, 1974, Formal Philosophy, Yale University Press, New Haven.
  • R. Muskens, J. van Benthem & A. Visser, 1997, ‘Dynamics’, in J. van Benthem & A. ter Meulen, eds., Handbook of Logic and Language, Elsevier, Amsterdam.
  • S. Negri, 2011, ‘Proof Theory for Modal Logic’, Philosophy Compass 6/8, 523–538.
  • R. Nozick, 1981, Philosophical Explanations, Harvard University Press, Cambridge (Mass.).
  • M. Osborne & A. Rubinstein, 1994, A Course in Game Theory, The MIT Press, Cambridge (Mass.).
  • A. Palmigiano, W. Conradie, and S. Ghilardi, 2014, ‘Unified Correspondence’, in A. Baltag & S. Smets, eds., Johan van Benthem on Logic and Information Dynamics, Outstanding Contributions to Logic, Springer, Dordrecht, 933–975.
  • M. Pauly, 2001, Logic for Social Software, Dissertation, Institute for Logic, Language and Computation, University of Amsterdam.
  • A. Perea, 2011, Epistemic Game Theory, Cambridge University Press, Cambridge.
  • V. Pratt, 1981, ‘A Decidable Mu Calculus’, Foundations of Computer Science, SFCS 22, 421–427.
  • A. Prior, 1967, Past, Present and Future, Clarendon Press, Oxford.
  • W. Rabinowicz & K. Segerberg, 1994, ‘Actual Truth, Possible Knowledge’, Topoi 13, 101–115.
  • G. Restall, 2000, An Introduction to Substructural Logics, Routledge, London.
  • K. Segerberg, 1971, An Essay in Classical Modal Logic, Filosofiska Institutionen, University of Uppsala.
  • K. Segerberg, 1995, ‘Belief Revision from the Point of View of Doxastic Logic’, Bulletin of the IGPL 3, 534–553.
  • Y. Shoham & K. Leyton-Brown, 2008, Multiagent Systems: Algorithmic, Game Theoretic and Logical Foundations, Cambridge University Press, Cambridge.
  • R. Stalnaker, 1984, Inquiry, The MIT Press, Cambridge (Mass.).
  • R. Stalnaker, 1999, ‘Extensive and Strategic Form: Games and Models for Games’, Research in Economics 53, 293–319.
  • R. Stalnaker, 2006, ‘On Logics of Knowledge and Belief’, Philosophical Studies 128, 169–199.
  • E. Swanson, 2011, ‘On the Treatment of Incomparability in Ordering Semantics and Premise Semantics’, Journal of Philosophical Logic 40, 693–713.
  • A. Tarski, 1938, ‘Der Aussagenkalkül und die Topologie’, Fundamenta Mathematicae 31, 103–134.
  • A. Troelstra & D. van Dalen, 1988, Foundations of Constructivism, Elsevier, Amsterdam.
  • F. Veltman, 1985, Logics for Conditionals, Dissertation, Philosophical Institute, University of Amsterdam.
  • Y. Venema, 2006, ‘Modal Logic and Algebra’, In Handbook of Modal Logic.
  • Y. Venema, 2007, Lectures on the Modal Mu–Calculus, Institute for Logic, Language and Computation, University of Amsterdam.
  • H. Wansing, ed., 1996, Proof Theory of Modal Logic, Kluwer, Dordrecht.
  • T. Williamson, 2000, Knowledge and its Limits, Oxford University Press, Oxford.
  • T. Williamson, 2013, Modal Logic as Metaphysics, Oxford University Press, Oxford.
  • G. H. von Wright, 1963, The Logic of Preference, Edinburgh University Press, Edinburgh.
  • S. Yalcin, 2007, ‘Epistemic Modals’, Mind 116 (464):983–1026.
  • E. Zalta, 1993, ‘A Philosophical Conception of Propositional Modal Logic’, Philosophical Topics 21:2, 263–281.

Author Information

Johan van Benthem
Email: http://staff.fnwi.uva.nl/j.vanbenthem
University of Amsterdam, Stanford University, and Tsinghua University
The Netherlands, U. S. A., and China

Religious Disagreement

The domain of religious inquiry is characterized by pervasive and seemingly intractable disagreement. Whatever stance one takes on central religious questions—for example, whether God exists, what the nature of God might be, whether the world has a purpose, whether there is life beyond death—one will stand opposed to a large contingent of highly informed and intelligent thinkers. The fact of extensive religious disagreement raises several distinct philosophical questions. One significant question arises within the context of political philosophy: may religious conceptions of the good and the right legitimately ground one’s political convictions in a pluralistic society marked by diverse and often conflicting religious convictions? Other questions concern the possibility of reconciling disagreement data with specific religious beliefs. For example, can persistent religious disagreement be squared with the conviction of many Christians and other theists that God “desires everyone to be saved and to come to knowledge of the truth” (I Timothy 2:4, NRSV)? These and other important questions will not be taken up here. The focus of this article is the epistemic challenge raised by religious disagreement: does awareness of the nature and extent of religious disagreement make it unreasonable to hold confident religious, or explicitly irreligious, views? Many philosophers have answered this question in the affirmative, arguing that the proper response to religious disagreement is religious skepticism. Others contend that religious conviction may be reasonably maintained even in the face of disagreement with highly qualified thinkers.

Reflecting on the epistemic challenge posed by religious disagreement readily leads one to questions concerning the epistemic significance of disagreement in general, religious or otherwise. One might think that religious disagreement does not raise any distinctive epistemological questions beyond those that are addressed in a more general work on disagreement. There are, however, features of religious disagreements that present problems that, for the most part, are not adequately addressed in such a work. These features include the lack of agreement on what skills, virtues, and qualifications are most important for assessing the questions under dispute; the fact that many of the disputed beliefs are arguably epistemically fundamental; the significant evidential weight that is assigned to private experiences; and the prominence of practical or pragmatic considerations in the justifications offered for opposing viewpoints. While these features taken individually may not be exclusive to religious disagreements, the fact that they frequently coincide in religious disputes and are especially salient in such disputes makes religious disagreement a worthy epistemological topic in its own right. The bulk of this article will focus on these problematic features of religious disagreements and the special questions they raise.

Table of Contents

  1. The First-Order and Higher-Order Significance of Religious Disagreement
  2. The Conciliatory Argument for the Higher-Order Defeat of Religious Belief
    1. Strong Conciliatory Policies
    2. Modest Conciliatory Policies
    3. The a Posteriori Stage of the Argument
    4. The Scope of the Conciliatory Argument
  3. Permissivist Responses to the Conciliatory Argument
  4. Religious Belief and the Problem of Judging Epistemic Credentials
  5. Fundamental Versus Superficial Disagreements
  6. Appeals to Religious Experience
  7. Faith and Practical Responses to the Problem of Religious Disagreement
  8. Conclusion
  9. References and Further Reading

1. The First-Order and Higher-Order Significance of Religious Disagreement

Religious disagreement may present two distinct sorts of evidential challenges to a given religious belief: a first-order challenge and higher-order challenge. (Henceforth, the label “religious belief” will typically be used to refer to all beliefs that take a stand on religious questions, including explicitly irreligious beliefs such as the belief that there is no God.) The aim of this section is to clarify the distinction between first-order and higher-order evidential challenges and to look at examples of how religious disagreement may possess first-order significance for religious belief. The remaining sections will focus on the higher-order challenge posed by religious disagreement.

In the epistemological literature on disagreement, a contrast is frequently drawn between first-order and higher-order evidence (Kelly 2005). The distinction may roughly be characterized as follows. First-order evidence for or against some proposition p “directly” bears on the question of whether p, whereas higher-order evidence for or against p does not directly bear on the question of whether p but directly bears on the question of whether one has rationally assessed the first-order evidence for or against p. To illustrate the distinction, consider the case (from Rotondo 2013) of Detective, who has stayed up all night studying the evidence bearing on a particular crime. At the end of the lengthy process of sifting the evidence, Detective judges that it is very likely that Lefty, rather than Righty, committed the crime. When she calls Lieutenant to share her conclusion, Lieutenant asks whether Detective has stayed up all night and then informs Detective that every time Detective has stayed up all night in the past, her reasoning has been atrocious and unreliable (despite its seeming to Detective that nothing is amiss). Let’s call this fact that Detective has a bad track record after all-nighters UNRELIABLE. According to many epistemologists, upon learning UNRELIABLE, Detective ought to become significantly less confident that Lefty committed the crime. (Still, there are others [Lasonen-Aarnio 2014 and Titelbaum 2015] who argue that Detective should not reduce confidence if she in fact assessed the first-order evidence correctly.) Thus, UNRELIABLE is arguably evidence against Detective’s proposition that Lefty committed the crime. However, UNRELIABLE does not directly bear on the question of whether Lefty committed the crime in the same way that the evidence Detective stayed up examining does. It is not as though UNRELIABLE is more to be expected if Righty committed the crime than if Lefty committed the crime: someone who had a full night’s sleep before examining the evidence inspected by Detective could dismiss UNRELIABLE as evidentially irrelevant. If UNRELIABLE gives Detective reason to doubt the Lefty hypothesis, it is only because UNRELIABLE is higher-order evidence that raises doubts about any conclusion that Detective might have reached after an all-nighter.
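
In probabilistic terms (a gloss added here for illustration, not part of the case as originally presented), the point is that UNRELIABLE is roughly equally likely whether Lefty or Righty is the culprit:

    \[P(\textit{UNRELIABLE} \mid \textit{Lefty}) \approx P(\textit{UNRELIABLE} \mid \textit{Righty})\]

so learning it leaves the first-order balance between the two hypotheses where it was; whatever rational pressure it exerts runs instead through doubts about the quality of Detective’s assessment on this occasion.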

Facts about religious disagreement may pose first-order or higher-order evidential worries (or both) for religious belief.

Suppose that religious view R1 suggests a view of human nature where persistent religious disagreement is to be expected, while religious view R2 suggests a view of human nature where persistent religious disagreement would be very surprising (though not impossible). Given this supposition, persistent religious disagreement would constitute first-order evidence in favor of R1 over R2. In addition to this first-order significance of religious disagreement, however, facts about disagreement may constitute higher-order evidence against both religious views if these facts raise worries about the rationality of one’s religious views or the reliability of the process by which one’s religious beliefs have been formed. Alternatively, we can imagine a situation where widespread religious agreement on the truth of R1 provides higher-order support of R1 (by boosting one’s confidence in the general reliability of religious belief formation) even though religious agreement is unexpected given R1, and thus counts as first-order evidence against it. The first-order significance of religious disagreement is thus distinct from its higher-order significance.

An example of an argument in the philosophy of religion that makes claims about the first-order significance of religious disagreement is the argument from divine hiddenness. This argument against theism begins by noting that according to most theists, the highest good for a human being is to be in a loving relationship with God. Many theists also claim that since God loves all human beings, God desires to be in a loving relationship with each human person. If this view is a component of theism, then, given theism, we have reason to expect that God would make God’s existence evident to all—for the lack of belief that God exists is a barrier to the loving relationship that God desires. The fact that many intelligent and thoughtful people fail to believe in God, including many people who indicate they would like to believe in God if it were possible for them to do so, is evidence that God has not made God’s existence very evident, contrary to what theism might lead us to expect. Thus, extensive and pervasive disagreement over whether God exists is claimed to be evidence against theism.

John Hick (2004) offers a very different characterization of the first-order evidential significance of religious disagreement. Rather than suggesting that such disagreement supplies an evidential basis for atheism, Hick suggests that such disagreement can instead be viewed as evidence that genuine encounters with “the Real”—that transcendent reality that is the source of salvation and that is encountered in all of the world’s great religions—are inevitably understood through conceptual frames that prevent unproblematic cognitive access to the Real as it is in itself and lead to diverse, and often conflicting, interpretations of such experiences. This position, which Hick labels “religious pluralism,” is not motivated by intractable religious disagreement alone. Hick emphasizes that the major world religions have all proven to be successful as vehicles that move practitioners from “self-centredness” to “Reality-centredness,” and this ethical parity across multiple faiths is seen by Hick as undermining the basis for thinking that one religious tradition may reasonably claim supremacy in the veridicality of its teaching. However, it is clear that Hick’s pluralism would be unmotivated if it were the case that religious dialogue typically led to an end of religious disagreement and to agreement on the teachings of one religious tradition. Hence, the fact of persistent religious disagreement does play a crucial evidential role in the case for Hick’s pluralistic hypothesis.

The argument for divine hiddenness and the case for religious pluralism can both be understood as appeals to the first-order rather than higher-order significance of religious disagreement. Consider first Hick’s pluralism. Since pluralism is itself a controversial and significantly contested religious viewpoint, the higher-order worries raised by disagreement as discussed below would seem, at least initially, to apply as much to the belief in pluralism as to other religious convictions. Considered as a piece of first-order evidence, however, religious disagreement does lend more support to religious pluralism than to many other religious hypotheses.  If culturally-conditioned interpretive frameworks are as entrenched and significant as pluralists contend, then we should expect religious conversion to be fairly rare and religious disagreement to be rather intractable, as is in fact the case. Many non-pluralist religious perspectives will have a harder time accommodating this datum. Similarly, the argument from divine hiddenness is clearly a first-order rather than higher-order challenge to theistic belief. Even if religious disagreement did not pose a higher-order challenge to the theist, the fact of significant and persistent disagreement over theism could still be first-order evidence against theism. For example, even if a theist somehow knew that she and her fellow theists were in possession of more evidence than non-theists, so that the disagreement over theism did not give her any reason for questioning whether or not she and other theists have made some error in their assessment of the evidence, the fact that many reject theism, even if due to lack of information, would still constitute evidence against theism since prevalent disbelief is more to be expected given atheism than theism.

2. The Conciliatory Argument for the Higher-Order Defeat of Religious Belief

We turn now from the first-order significance of religious disagreement to an argument for the claim that religious disagreement constitutes higher-order evidence that renders religious belief (or at least many religious beliefs) unjustified—that is, that religious disagreement constitutes a higher-order defeater for religious belief. The argument for this conclusion can be seen as consisting of two components: one a priori and the other a posteriori. The a priori component aims to defend some general “conciliatory” policy that says that in disagreements that satisfy certain conditions, the proper response is a conciliatory one that gives significant weight to the views of one’s disputants. This conciliatory response might involve giving up one’s belief and adopting an agnostic stance towards the question under dispute. Or, if someone’s doxastic state is better described not in terms of belief or unbelief, but in terms of subjective probabilities or “credences,” then the appropriate conciliatory response might involve adopting a new credence for the disputed proposition that gives significant weight to the initial credences of one’s disputants. The a posteriori component of the argument aims to show that the core commitments of religious believers are in fact subject to the relevant type of disagreement—a disagreement where the aforementioned conciliatory principle requires significant reduction in confidence.

a. Strong Conciliatory Policies

The a priori stage of the argument defends some conciliatory policy that is demanding enough to require significant reduction of confidence in religious disagreements. There is no canonical conciliatory policy that is agreed upon by those who argue that disagreement has significant higher-order force, but a variety of conciliatory requirements have been proposed, some more demanding than others. Despite the diversity of conciliatory proposals, one can discern behind the most demanding conciliatory views two basic commitments (Vavova 2014). The first is a principle that requires epistemic deference to other thinkers in proportion to their apparent epistemic qualifications, and the second is a principle that constrains the types of reasons to which one can legitimately appeal when assessing the relative epistemic qualifications of the various sides of the dispute. It is worth separating these principles out, since some criticisms of the most demanding conciliatory views can be understood as targeting the first principle, while others, the second. We might articulate these principles as follows:

DEFERENCE: In a disagreement over p, one ought to show epistemic deference to suitably qualified thinkers, giving equal weight to one’s own view and the view of an apparent epistemic peer (where an “epistemic peer” is someone whose epistemic credentials with respect to p are equal to one’s own) and more weight to the view of someone who appears to be one’s epistemic superior.

INDEPENDENCE: In assessing my own and my disputant’s epistemic credentials with respect to p, in order to determine how (or whether) to modify my own belief about p, I should base my assessment on grounds that are independent of my disputed reasoning concerning p. (Adapted from Christensen 2011.)

Consider DEFERENCE first. The basic idea behind DEFERENCE is that one cannot reasonably maintain confident belief that p while thinking that those who reject p are just as qualified and well-positioned to assess the plausibility of p as those on one’s own side of the dispute. For example, according to DEFERENCE, someone who believes that Muhammad is a prophet of God cannot reasonably think that those who reject this claim are, taken as a whole, just as qualified to assess the claim as those who accept it. Whether DEFERENCE is plausible partly depends on what is meant by “epistemic credentials.” The principle is plausible only if our understanding of epistemic credentials is such that we should expect those who are more credentialed to be more reliable in their views on the disputed question than those who are less credentialed. This means that epistemic credentials must be assessed relative to the particular proposition under dispute and the particular occasion when the proposition was assessed. Furthermore, the relevant credentials must take into account all of the dimensions of epistemic evaluation that bear on the reliability of one’s judgment on the matter at hand, such as the quality and quantity of one’s evidence and the ability to assess that evidence in a rational and unbiased manner. This understanding of epistemic credentials may not align with conventional notions of expertise. For example, someone who is a leading researcher on a contested scientific question might count as less credentialed than a well-informed non-specialist if there is concern that the researcher’s personal involvement has biased her judgment.

Given a sufficiently fine-grained understanding of epistemic credentials, DEFERENCE looks very plausible. Those who reject DEFERENCE must affirm that there are situations where I can reasonably stand by my view that p even though I acknowledge that my epistemic position with respect to p is no stronger than that enjoyed by my interlocutor who denies p. It seems that in such a situation, I would need to hold that, despite our equally strong epistemic positions, I was simply lucky in arriving at the truth and my interlocutor was not. Many writing on disagreement seem to take it for granted that this would not be a rational position. Even those who oppose demanding conciliatory views typically hold that in order to dismiss the worries raised by disagreement, it is necessary to identify a “symmetry breaker” (Lackey 2010), or some reason for thinking that one’s own side is better placed to assess the disputed matter than the opposition. In the next section, we consider one response to disagreement-motivated religious skepticism that involves rejecting DEFERENCE.

INDEPENDENCE is the more controversial of the two conciliatory principles offered above. Christensen (2009), who is responsible for labeling the principle, argues that INDEPENDENCE is the key principle separating “conciliationists” and their opponents. According to Christensen, INDEPENDENCE captures what is wrong with “blatantly question-begging” responses to disagreement (2011). He gives an example where two individuals who are sharing a dinner at a restaurant with several friends both calculate in their head what each person’s share of the total bill comes to (Christensen 2007). They agree to add 20% of the post-tax total for tip and to split the check evenly among each member of the party. Both friends do this sort of calculation often and know that the other person is no more or less reliable than they are. They usually agree on the answer in such cases, but in those instances when they do reach different answers, neither of them has proven more likely than the other to be the one who has made an error. While nothing is out of the ordinary in this case (for example, neither friend is especially distracted or extra alert), upon finishing their mental calculations they discover that their answers differ: one arrived at an answer of $43, and the other, $45. According to Christensen (2011), INDEPENDENCE is needed to explain why it would be illegitimate for one of the two friends to dismiss the significance of the disagreement by reasoning as follows: “Since my friend fails to see that the facts support an answer of $43, I have good reason for thinking that, contrary to my expectations, she is not (at least at this moment) a reliable judge of the question we are disputing; therefore, her disagreement gives me no reason at all to question my answer.”

Most agree that this sort of response to the disagreement is unreasonably sanguine. But it is questionable whether a principle as strong as INDEPENDENCE is needed to explain why this response is unreasonable (Kelly 2013). Note that in the present example, the speaker did not have any reason to be dismissive of his friend’s view antecedent to learning that she arrived at a different answer. His only reason for judging that she was unreliable was the fact that her answer differed from his. Such crudely dogmatic reasoning, if acceptable, could be used to dismiss as misleading any piece of evidence that goes against what one believes. This is quite different from a situation where someone’s belief that p gives him a reason for thinking, even before learning what his friend thinks about p, that his friend is unlikely to be reliable in her judgment concerning p. Consider, for instance, a situation where someone comes to believe in a religion that teaches that the wealthy are frequently biased when it comes to assessing spiritual questions. Such a person has a reason for distrusting his wealthy friend’s opinion concerning the religion even before learning what her opinion is. Suppose the new convert learns that his wealthy friend does reject the new religion as false, but the convert is largely unworried by the disagreement since his new religion teaches that the wealthy are biased on such matters. While this dismissal appeals to dispute-dependent reasons and thus violates INDEPENDENCE, the dismissal would not be based on the crude sort of dogmatic reasoning that would always be available in any dispute. It is at least less obvious in this case that the dispute-dependent reasoning is objectionable. This provides some reason for doubting whether a strong anti-question-begging principle like INDEPENDENCE is needed in order to explain why the quick dismissal in the calculation case is problematic.

b. Modest Conciliatory Policies

The last section focused on the most demanding sort of conciliatory policy, which features both DEFERENCE and INDEPENDENCE. But many proponents of broadly conciliatory views advocate less demanding policies that feature weaker principles than these. In particular, many seek to avoid some of the radical implications that are thought to follow from INDEPENDENCE, opting instead for a weaker anti-question-begging constraint. To see why INDEPENDENCE taken as a general requirement is thought to support implausibly demanding prescriptions, suppose you find yourself in a disagreement with a radical skeptic who believes that human cognitive faculties, including those employed in philosophical reasoning, are systematically unreliable. You might have many reasons for thinking that this skeptic is not particularly qualified with respect to philosophical matters. Perhaps he has not read any academic philosophy and succumbs to several logical fallacies in his argumentation. Still, these reasons for putting little trust in your interlocutor would not be dispute-independent, since you should not trust your ability to judge epistemic credentials if you took seriously the skeptic’s view that your cognitive faculties are systematically unreliable. It would seem, then, that in this context, you cannot have dispute-independent reasons for thinking that you are more qualified than your disputant. Of course, you also cannot have dispute-independent reasons for thinking that your disputant is more qualified. Nonetheless, a lack of dispute-independent reasons for favoring either side may itself be considered a dispute-independent reason for having an equal estimation of the epistemic credentials of the two sides. If this is right, then INDEPENDENCE, in combination with DEFERENCE, would seem to require that you give up believing in the reliability of your cognitive faculties. Since most do not think that we must have non-question-begging reasons for rejecting skepticism, even when we encounter a real skeptic, many advocates of conciliatory views aim to articulate an anti-question-begging constraint that is less absolute than INDEPENDENCE. Conciliatory views that feature a weaker anti-question-begging requirement may nonetheless be sufficiently powerful to undermine religious beliefs.

One example of a weaker anti-question-begging requirement is that proposed by Schellenberg (2007, 171). Schellenberg allows that the need to avoid the most general skepticism warrants trusting those belief-forming mechanisms that are (nearly) universal and unavoidable, even when there are no non-question-begging reasons for such trust. This would explain why we need not capitulate in disagreements with an isolated skeptic who doubts the reliability of our perceptual, memorial, and/or inferential faculties. According to Schellenberg, however, a mechanism that is neither universal nor unavoidable should not be trusted in the absence of independent grounds for thinking that it is reliable. Since he maintains that religious belief-formation is neither universal nor unavoidable, and since it is not possible in the current context of religious controversy to give non-question-begging grounds for taking some particular mechanism of religious belief formation to be reliable, Schellenberg concludes that religious skepticism is the only rational option.

Alston (1991, 198-9) has contested the claim that non-universal belief-forming mechanisms should be held to higher epistemic standards than universal mechanisms, claiming that there is no reason to suppose that all mechanisms worthy of default trust will be common to all or most mature adults. Additionally, Schellenberg’s criteria appear to have consequences that many would find dubious.

Consider the example (adapted from Plantinga [2000, 450]) of someone in colonial America who is strongly inclined towards the view that chattel slavery is morally abhorrent, but who is not unavoidably drawn to this conclusion. Schellenberg’s criteria seem to imply that such a person cannot rationally judge slavery to be morally abhorrent unless she can cite dispute-independent reasons for thinking her own moral views are more likely to be reliable than the majority position. This is an unpalatable conclusion, since it is questionable whether such dispute-independent reasons could be identified.

Another attempt to articulate a more qualified constraint on question-begging is that supplied by Christensen (2011). Christensen acknowledges that in a dispute with a radical skeptic (or with someone else who questions all of the beliefs we rely on to assess epistemic credentials), we lack a dispute-independent reason for thinking that we are more qualified than our disputant. Nevertheless, he argues that the mere absence of dispute-independent reasons for favoring one’s own side is not enough for disagreement to pose legitimate skeptical worries. Nor will disagreements pose serious skeptical worries in cases where a dispute-independent evaluation produces only a very weak reason for thinking that the credentials of one’s disputant rival or surpass one’s own. On the contrary, significant conciliation will typically be required only when there are sufficiently strong positive dispute-independent grounds for trusting the other side. Christensen’s bill tabulation case, where the disagreeing friends have significant track record data that suggest that they are equally reliable at mental math, is an example where the disputants have strong dispute-independent reasons for taking the other person to be an epistemic peer, resulting in significant conciliatory pressure. In contrast, consider a disagreement between two philosophers who systematically disagree across a wide range of ethical questions. While these philosophers might acknowledge that they are both comparable in intelligence, degree of philosophical training, and general intellectual virtue, this sort of parity provides a much weaker reason (in comparison to the solid track record data of the first case) for thinking that the other person is an approximate peer. Christensen’s view would therefore suggest that the conciliatory pressure is weaker in this latter case. The significance of Christensen’s moderate conciliationism for religious belief is discussed in section 4.

c. The a Posteriori Stage of the Argument

No plausible conciliatory policy will require giving up religious belief in the face of just any disagreement. Plausible policies will require religious skepticism only if one’s religious beliefs are contested by a sufficiently large and qualified contingent of individuals. The full argument that religious disagreement defeats some religious belief must do more than merely assert that the belief is contested; it must assert that the degree of dissent is significant enough that the correct view on disagreement will require abandoning the belief. This is the a posteriori stage of the argument, which has received little attention despite the fact that it is far from trivial. Consider the evidential implications of the distribution of opinion concerning the existence of God. A 2010 poll of over 18,000 adults conducted by Ipsos in 23 countries found that 51% of respondents reported believing in at least one God or “Supreme Being,” 17% reported sometimes believing and sometimes not believing, 13% reported not being sure if they believe, and 18% indicated that they do not believe in any sort of divine being. These percentages vary only slightly across different levels of education. The epistemic import of this data is far from clear. Kelly (2011) suggests that the fact that theistic belief is significantly more prevalent than atheistic belief constitutes evidence that at least slightly supports theism. But some proponents of religious skepticism may argue that the exact proportions are not very significant, and that what is epistemically significant is the lack of consensus. Additionally, many hold that the beliefs in the overall population are far less significant than the beliefs of those with relevant expertise. And atheism is the dominant position in certain communities of experts. For example, a large 2009 survey of professional philosophers conducted by Bourget and Chalmers at 99 leading departments found that 73% of professional philosophers accepted or leaned towards atheism, while only 15% accepted or leaned towards theism. On the other hand, most philosophers specializing in philosophy of religion were theists, with 72% accepting or leaning towards theism and only 19% accepting or leaning towards atheism. Draper and Nichols (2013) argue that the specialists in philosophy of religion are significantly influenced by pro-religious biases, a claim which, if true, would perhaps significantly diminish the epistemic significance of the prominence of theistic belief among philosophers of religion. No doubt some theists would counter that certain selection effects and anti-theistic biases in the wider professional culture of philosophy help to explain the prominence of atheism among philosophers as a whole. In any case, many religious believers would contest the notion that philosophical expertise is the most important qualification for reliably evaluating religious questions (see section 4). Clearly, several delicate and contentious questions must be addressed by anyone attempting to measure the degree of qualified opposition to a given religious perspective.

d. The Scope of the Conciliatory Argument

Some proponents of disagreement-motivated religious skepticism target any confident view on contentious religious questions, including those atheistic perspectives that would not typically be labeled “religious.” Others argue that religious disagreement defeats the justification of explicitly religious worldviews, but do not think that secular atheism is similarly threatened. According to Kitcher (2014, 7), “the religious convictions of many contemporary believers are formed in very much the same ways,” namely, through trusting a religious community that claims to preserve and pass on the teachings of prophets and mystics who had some special connection with God or the transcendent. Since this process leads to incompatible beliefs when it is employed in different cultural contexts, the process is unreliable and cannot justifiably be trusted. Because Kitcher does not think that religious disagreement undermines secular atheism—indeed, he appeals to religious disagreement precisely to motivate a “soft atheism” that makes “small concessions in the direction of agnosticism” (23-4)—he presumably thinks that acceptance of secular atheism is not based on a process of communal trust that is relevantly like the unreliable sort of trust that typically grounds religious conviction. Thus, religious disagreement defeats those beliefs typically labeled “religious,” but does not defeat secular convictions.

A significant problem for this more narrowly targeted defeat argument is that many religious believers will deny that the process by which they hold their convictions is accurately characterized as one of uncritically trusting their religious community. They may acknowledge that unreflective trust of a religious community is unjustified (in light of religious diversity), but insist that the process by which they hold their beliefs is one of critically reflecting on all of the evidence. This evidence includes their personal experience and community’s experience, to be sure, but also testimonial evidence from other communities, scientific evidence, and philosophical considerations. Kitcher anticipates a response along these lines, but insists that even when we confine our focus to reflective and philosophically sophisticated religious believers, we still find a substantial amount of controversy, and for this reason he thinks that even philosophically reflective religious belief is defeated by disagreement (9-10). What is unclear, however, is why Kitcher thinks that an irreligious secular outlook can avoid epistemic defeat when the process that presumably accounts for his secular convictions—namely, the process of critical philosophical examination of all the relevant evidence—is a process that appears to lead many thinkers to conclusions that are incompatible with his secularism. If one insists that religious disagreement defeats even reflective religious belief, it will be difficult to maintain that explicitly irreligious belief is not similarly threatened (Bogardus 2013a, 390).

3. Permissivist Responses to the Conciliatory Argument

As discussed in the last section, the conciliatory views that lie behind arguments for disagreement-based religious skepticism can often be understood as consisting of two commitments: a commitment to a principle like DEFERENCE that requires epistemic deference to apparently qualified interlocutors, and a commitment to a principle like INDEPENDENCE that prohibits a question-begging assessment of epistemic qualifications. While many responses to arguments for disagreement-based religious skepticism take issue with INDEPENDENCE, some target DEFERENCE. These responses are the focus of this section.

Critics of DEFERENCE maintain that its plausibility depends on an unacceptably restrictive conception of rationality according to which a given body of evidence rationalizes at most one doxastic attitude towards any given proposition (Schoenfield 2014). On this restrictive picture, if two agents A and B have exactly the same evidence bearing on p and are both perfectly rational in responding to that evidence, then A and B will adopt the same doxastic attitude towards p. This thesis is frequently called the “uniqueness thesis” (Feldman 2007), since it holds that there is a uniquely rational doxastic response to any particular body of evidence. Critics of DEFERENCE deny uniqueness and maintain that, in at least some contexts, there is no single response to a given body of evidence that is maximally rational. Religious questions are often cited as one context where rationality is “permissive” in this way. Surely, these “permissivists” maintain, there is no single credence for, say, God’s existence that stands alone as the maximally rational response to a given body of evidence.

The permissivist objection to the applicability of DEFERENCE in religious disputes thus consists of two claims: (i) DEFERENCE applies in contexts of religious disagreement only if such contexts are rationally impermissive (such that there is a unique doxastic response that is fully rational); and (ii) many religious disagreements are contexts where rationality is permissive.

Here is one way of motivating the first claim. Suppose that Al knows that Beth possesses more or less the same evidence as he does concerning religious matters and that she is epistemically impeccable. Presumably, the discovery that Beth rejects Al’s religious views should lead Al to worry about his religious views only if this discovery gives Al reason to suspect that his view is not a fully rational response to his (pre-disagreement) evidence. Furthermore, Beth’s contrary religious viewpoint gives Al a good reason to suspect such irrationality only if it cannot be the case that there are multiple contrary religious viewpoints that are each a fully rational response to that evidence. In other words, the disagreement gives Al a good reason to question the rationality of his initial view only if their religious dispute is a context where rationality is impermissive. If full rationality permits a variety of religious perspectives in response to the same evidence, then religious disagreement does not raise worries about the rationality of one’s pre-disagreement religious views, and the epistemic deference commended by DEFERENCE would seem to be unmotivated.

It is controversial whether DEFERENCE depends for its motivation on a non-permissivist conception of rationality as the above argument maintains. While advocates of conciliatory views do frequently characterize the worry raised by disagreement as a worry about the rationality of one’s pre-disagreement position, this need not be the only concern raised by disagreement (Christensen 2014). Even if Al knew that his religious reasoning is perfectly rational, Beth’s disagreement could still raise a different sort of worry: Beth’s disagreement might constitute evidence that rational reflection on religious questions does not reliably lead to true religious beliefs. And the knowledge that the rational formation of one’s religious views does not reliably conduce to true belief arguably gives Al a defeater for his religious views, even if Al knows that, prior to learning of the disagreement, his views were fully rational. Of course, if almost all rational people agreed with Al and only a few with Beth, Al might be able to affirm that rational reflection on religious matters does reliably lead to the truth, and perhaps he could be untroubled by the fact that in Beth and a few others, genuinely rational reflection has led to false religious views. (Beth, being in the small minority, could not be similarly sanguine.) But if the number of apparently rational thinkers who are as informed as Al and yet disagree with him rivals the number who agree with him, then religious disagreement would supply evidence of the unreliability of rational religious reflection and may on that account defeat confident religious belief.
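A rough numerical illustration of this last point may help (the figures are hypothetical and are not drawn from the literature discussed here). Suppose 100 equally informed thinkers reflect rationally on whether p, and 50 conclude that p while 50 conclude that not-p. Whichever side is correct, rational reflection has issued in a false verdict in 50 of the 100 cases, so its reliability on this question can be at most

\[
\frac{100 - 50}{100} = 0.5 .
\]

This is the sense in which a roughly even split among apparently rational, equally informed thinkers supplies evidence that rational reflection on the contested question is unreliable.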

Supposing that DEFERENCE does depend on a non-permissive conception of rationality, despite the preceding reflections, is it plausible to maintain that rationality is permissive with respect to many religious questions? It seems fairly clear that there are contexts where rationality is not permissive. For example, someone’s credence that they will win a particular lottery ought to conform to the mathematical odds (absent any reason to suspect supernatural intervention or corrupt lottery officials). But many philosophers find it implausible to suppose that the requirements of rationality are equally precise in domains of inquiry like religion. The types of evidence and rational standards that govern views on the reality of an afterlife, for example, seem too coarse-grained to admit of a precise credence that is maximally rational, or even a maximally rational credence range with precise endpoints. (For a vigorous defense of the uniqueness thesis, see White [2005].) Also, even if we are concerned not with credences but with the coarse-grained doxastic attitudes of belief, disbelief, and withholding (neither believing nor disbelieving), it seems implausible to suppose that there are no borderline cases where either of two attitudes (for example, belief or withholding) could be fully rational.
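To make the lottery example concrete (the numbers and notation are illustrative, not drawn from the sources cited here): in a fair lottery of n tickets in which one holds a single ticket, the impermissivist will say that there is exactly one fully rational credence in winning, namely

\[
Cr(\mathrm{win}) = \frac{1}{n}, \qquad \text{for example } n = 1{,}000{,}000 \;\Rightarrow\; Cr(\mathrm{win}) = 0.000001,
\]

and that any agent with the same evidence who assigns a substantially different credence is thereby less than fully rational. The permissivist claim under discussion is that questions such as the reality of an afterlife admit of no comparably precise constraint.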

Even if there are good reasons for thinking that rationality is permissive in religious contexts, it is not clear that an appeal to permissive rationality can defuse worries raised by religious disagreement. First, the affirmation that rationality permits fundamentally opposed religious perspectives appears to be incompatible with certain religious perspectives. The apostle Paul, for instance, asserts that creation provides evidence of God’s eternal power and divine nature that is plain to all, and that the wicked who turn away from God “suppress the truth” (Romans 1.18ff). Second, even if permissivism is correct and religiously acceptable, questions may be raised about just how permissive rationality is. Perhaps weak belief in the reality of an afterlife as well as agnosticism on the question could both be fully rational responses to a given body of evidence. But could confident belief as well as confident disbelief in the afterlife both be fully rational responses to a given body of evidence? An extreme permissivist view that answers this question in the affirmative is significantly more controversial than permissivism itself.

Because it is not clear how permissive rationality is, if indeed it is permissive, with respect to religious questions, it is not clear what degree of religious disagreement among those with similar evidence is required to indicate likely irrationality on the part of at least one of the parties to the disagreement. But religious disagreement is notable for being so extreme. Many Christians, for example, are utterly convinced that God raised Jesus from the dead; by contrast, many atheists think that theism, not to mention Jesus’ resurrection, is fanciful nonsense that can be dismissed out of hand. Perhaps no other domain of inquiry exhibits this degree of doxastic polarization. Those who appeal to permissivism in order to defuse worries raised by religious disagreement must therefore affirm a very strong form of permissivism according to which rationality radically underdetermines the appropriate response to a given body of evidence. Even philosophers who are inclined towards permissivism may find such an extreme form of the view unpalatable and implausible.

4. Religious Belief and the Problem of Judging Epistemic Credentials

If DEFERENCE is correct, despite the permissivist challenge just considered, then the religious views of apparent epistemic peers or epistemic superiors on religious matters ought to be accorded significant weight. But how are we to determine who our epistemic peers and superiors are? Asked differently, how are we to assess epistemic credentials? As discussed in section 2, many affirm some principle like INDEPENDENCE and maintain that epistemic credentials ought to be assessed in a dispute-independent manner. The fact that INDEPENDENCE helps to explain the intuitive verdict in the calculation case discussed above does lend it some plausibility. But as already noted, some proponents of conciliatory theories deny that we are always required to rely only (or even primarily) on dispute-independent reasons in responding to disagreement. Thus, even if INDEPENDENCE is on the right track, there may be features of religious disagreements that distinguish them from the calculation case and that weaken or render inapplicable the anti-question-begging requirement that applies in the calculation case.

One potentially significant difference between religious disagreements and the calculation case (and similar cases that are used to motivate INDEPENDENCE) is that in many religious disagreements, there is no clear basis for arriving at a dispute-independent judgment concerning the epistemic qualifications of the parties to the dispute (Pittard 2014). In the calculation case, the robust track record information provides a good dispute-independent basis for estimating the reliability of each friend in answering questions like the one under dispute. But consider a dispute between, say, a theist and an atheist. What nonpartisan, dispute-independent grounds do the disputants have for arriving at an estimation of their epistemic credentials concerning the question of God’s existence? One might think that they should compare their track record on other religious questions: for example, whether a relationship with God would be a great good, whether the sorts of suffering we observe in this world could be redeemed by God (should God exist), or whether there are plausible naturalistic and non-teleological explanations for the existence and character of our universe. However, it is quite clear that this procedure is unlikely to yield a nonpartisan assessment of their respective epistemic credentials. The atheist and theist will probably disagree on these questions as well, for reasons that are not independent of their dispute concerning theism, preventing them from arriving at a dispute-independent calculation of their “religious track record.”

If religious track records cannot provide the atheist and the theist with a basis for a nonpartisan assessment of epistemic credentials, one might think that they can arrive at such an assessment by considering the degree to which they each exhibit the intellectual capacities and epistemic advantages that are most important for a reliable assessment of religious claims. For example, they could estimate one another’s intelligence and analytic sophistication by means of some indicator like academic performance, and through conversation they could perhaps ascertain how extensively each of them has studied topics relevant to an assessment of theism. Unfortunately, in a wide range of religious disputes, this sort of procedure is unlikely to deliver a dispute-neutral assessment of epistemic credentials. This is because many systems of religious belief include incompatible views on what qualifies one to reliably assess religious claims. In this respect, religious disagreement is quite different from controversies in many other domains. Two civil engineers with opposing views on the merits of some bridge proposal will most likely agree on what sort of training and cognitive capacities are required to be a good judge of engineering questions, and they probably agree on which institutional signals (for example, academic degrees, professional experience, publications) serve as reliable evidence that someone possesses the requisite capacities. In many religious disputes, however, whether the disputed proposition is true or false has significant implications for the question of which qualifications best position one to assess the disputed proposition. Some Buddhists, for example, maintain that meditative disciplines are required in order to loosen the grip of certain delusions and to enable an adequate appreciation of the truth of Buddhist teachings concerning the non-existence of a personal self. Those who have considered Buddhism and who are not convinced are unlikely to accept that these meditative disciplines are an important qualification for an assessment of Buddhist claims. To consider another example, a Christian, inspired by the apostle Paul’s writings in I Corinthians 1-2, might affirm that scholarly credentials and analytic sophistication do not help one to see the truth of the Christian message, but that the key qualification is the possession of a divinely-given insight into the beauty and excellence of God as portrayed in the Christian message. Non-Christians will clearly not share this view concerning which qualifications are most important.

In many religious disputes, then, questions about which qualifications are most important cannot be separated from the primary religious matter under dispute, so that there is no shared theory of epistemic credentials that could ground a dispute-independent assessment of the disputants’ qualifications. If Christensen’s moderate conciliatory position is right and significant conciliation is required only when one has positive dispute-independent reasons for trusting the other side, and if in many religious disagreements there is no basis for a dispute-independent assessment of epistemic qualifications since questions about which credentials are relevant are caught up in the dispute at hand, then it seems that the correct conciliatory view may not require significant conciliation in many religious disputes.

Against this conclusion, one might protest that even if there is no nonpartisan theory of epistemic credentials that one can employ for a dispute-independent assessment of epistemic qualifications, it is quite probable that one’s own partisan view on epistemic credentials will give one reason to trust the other side. And if that is the case, then surely conciliation will be required. Suppose that atheists maintain that the most significant qualification in assessing theism is the possession of philosophical aptitude, and that theists maintain that the most significant qualification is a selfless love for others, which they think properly disposes the heart to see the truth of “divine things.” While there is no dispute-neutral theory of epistemic credentials in this case, it is certainly possible that the atheists’ own theory will give them a reason to assign significant weight to the views of theists if there are numerous theists who are philosophically qualified, and that the theists’ own theory will give them a reason to assign significant weight to the views of the atheists if atheists exhibit just as much selfless love as theists. In this case, both sides would, by their own lights, have reason to significantly reduce confidence in their respective views. In fact, we could say that both sides do have a dispute-independent reason for trusting the other side, the reason being that on either of the competing theories concerning which credentials are most relevant, there is reason to think the other side highly qualified.

As the above rejoinder shows, the mere lack of a common perspective on the relevant epistemic credentials is not enough to escape the threat posed by disagreement, even given a more moderate conciliatory view like Christensen’s. Conflicting theories of epistemic credentials will alleviate the worries posed by disagreement only if one’s partisan theory of epistemic credentials does not give one strong reason to trust the other side. Is there any reason to think that a theory of epistemic credentials that is part of some religious belief system will not supply strong reasons for thinking that adherents of other belief systems are epistemically qualified? This is, ultimately, an empirical matter that must be settled on a case-by-case basis: what does a given religious viewpoint say about which epistemic credentials are most important when it comes to religious matters? Does the theory of epistemic credentials implied by the religious perspective give strong reasons for thinking that many of those who reject the religious perspective in question are nonetheless epistemically qualified? Clearly, the answers to these questions could vary depending on which religious perspective we are inquiring about. Pittard (2014) gives some reasons to think that, in many cases at least, religiously-motivated theories of epistemic credentials will not supply strong reasons for thinking that those who reject the viewpoint in question are epistemically qualified. First, the credentials that are emphasized by religious traditions are often credentials the possession of which is not easily discernible. Taking inspiration from Jesus’ Sermon on the Mount, suppose that “purity of heart,” that is, untainted desire for God, is necessary in order to see the truth of divine things. Unlike more standard epistemic credentials that are relevant in mundane domains of inquiry, purity of heart is not something whose presence in one’s disputant can easily be confirmed or disconfirmed. And if the most important epistemic credentials pertaining to religious questions are unobservable, then one may not have very strong reasons for thinking that one’s disputant is qualified with respect to religious matters. Second, many systems of religious belief feature credentials that are unlikely to be possessed by someone who is not disposed to accept the belief system in question. Consider a Theravada Buddhist who maintains that the truth of “emptiness” is unlikely to be evident apart from substantial engagement in Buddhist meditation. While it is perhaps easy to confirm whether or not someone has engaged in Buddhist meditative practices in a disciplined way, it is unlikely that someone would pursue years of Buddhist meditation unless she was already positively disposed towards Buddhism. When the putative epistemic credentials are self-selecting in this way, it is less likely that there will be large numbers of disbelievers who possess the credentials.

If (i) a dispute-independent evaluation does not provide strong reasons for thinking that the other side of a religious dispute is as credentialed as one’s own side, (ii) one’s own partisan theory of religious epistemic credentials also does not supply such a reason, and (iii) significant conciliation is required in disagreements only to the extent that one has strong reasons for thinking that the qualifications of those on the other side of the dispute rival or surpass the qualifications of one’s own side, as Christensen asserts, then the skeptical implications of religious disagreement may be quite limited.

Against this line of thinking, some have complained that a view on disagreement is too weak if it allows religious believers to set aside worries raised by disagreement simply because their religiously-motivated theory of epistemic credentials does not give them reason to highly estimate the credentials of their disputants. Lackey (2014, 308), for example, notes that we should not affirm the reasonableness of the sexist who dismisses disagreement-related worries on the grounds that his disputant is a woman. Similarly, she insists that one should not be able to escape worries raised by religious disagreement simply by affirming a partisan and contestable view on the nature of the relevant epistemic credentials. In response, one could point out that while the sexist’s position after dismissing his female disputant is highly unreasonable, this is compatible with its being the case that he applied the correct policy for responding to disagreements. The unreasonableness of his final position may be explained by the unreasonableness of the sexist views he held before the dispute, and not by the employment of an incorrect disagreement norm. After all, we should not expect that applying the right disagreement norm will correct for rational failures that one brings into a disagreement situation. Lackey considers this response and answers as follows: “If an atheist sticks to her guns with respect to her belief that God does not exist just by regarding the theist as her epistemic inferior, this is irrationality in her response to a disagreement. It is not clear what could justify relegating these failures of rationality to epistemology generally rather than to the epistemology of disagreement in particular” (311).

What follows if Lackey’s more expansive conception of the epistemology of disagreement is granted? Perhaps the correct disagreement norm will still allow that the significance of religious disagreement is sensitive to one’s theory of epistemic credentials, but with the added caveat that one’s theory of epistemic credentials can mitigate the worries raised by disagreement only if one’s adherence to the theory is reasonable and not just an unmotivated attempt to block disagreement worries. It isn’t clear how this changes the dialectical situation, since adherents of a particular religiously-motivated theory of epistemic credentials presumably think that their theory is reasonable, and thus not analogous to the prejudice of the sexist. On the other hand, the correct disagreement norm could deny that the evidential significance of religious disagreement is sensitive to what theory of epistemic credentials one happens to hold. One way to do this would be for the disagreement norm to simply stipulate the correct theory of epistemic credentials. But this would require taking a stand on questions that are contested on religious grounds. The resultant norm could not supply a religiously neutral motivation for religious skepticism. Alternatively, the norm could require that one’s assessment of a disagreement always be dispute-neutral, and that equal weight be assigned to both sides in those cases where there is no agreement on the relevant credentials. But such a strong conciliatory norm would require capitulation in disagreements with radical skeptics, which is what led Christensen and others to search for principled conciliatory policies with more modest anti-question-begging constraints. In short, it is not clear whether there is a conciliatory norm that is religiously neutral and not overly skeptical, but that completely forbids relying on one’s particular theory of epistemic credentials in assessing the significance of a religious disagreement.

It should be emphasized that a moderate conciliatory view like Christensen’s will require reduction of confidence in many religious disputes, even if it does not require significant conciliation in inter-religious disputes where the two sides share very little common ground. Significant doxastic revision will likely be required in a wide range of religious disputes between those with broadly similar positions. Consider a disagreement between two theologians who disagree over the finer details of some shared theological framework. Given their extensive theological agreement, each party to the dispute has strong dispute-independent reasons for thinking that the other person is quite reliable as a guide to theological matters. This suggests that a moderate conciliatory framework of the sort considered here would call for significant deference. So even if outright religious skepticism is not required, believers might be required to loosen their religious views by adopting an agnostic stance towards many intramural disputes.

5. Fundamental Versus Superficial Disagreements

In philosophical discussions of disagreement, one frequently encounters the view that fundamental disagreements—that is, disagreements driven by incompatible epistemic starting points—should occasion less doxastic revision than disagreements that are superficial. Many who readily concede that disagreement can easily defeat one’s belief about the answer to a multistep math problem, for example, deny that one’s fundamental moral, philosophical, or religious convictions are similarly vulnerable in the face of disagreement. The previous section pointed to one reason that arguably goes some way towards explaining why fundamental disagreements might be less worrying than superficial ones: it might be that in fundamental disagreements, it is unclear what the relevant epistemic credentials are and who possesses them, making it unlikely that one will have strong independent grounds for thinking that the epistemic credentials of those on the other side of a dispute either rival or surpass the credentials of one’s own side. This section briefly considers two different accounts as to why religious disagreements that are suitably fundamental will pose less of a skeptical threat, and then considers whether religious disagreements are fundamental in the relevant sense.

Bogardus (2013b) argues that while peer disagreement undermines what he calls “knowledge from reports,” it does not undermine “knowledge from direct acquaintance.” Knowledge from reports, according to Bogardus, is mediated knowledge that rests on the output of some cognitive faculty, while knowledge from direct acquaintance involves “immediate and unproblematic access” (9) to the truth of the known proposition. In a case of knowledge that p from direct acquaintance, one “just sees” that p is the case, and p is part of one’s evidence base. Bogardus cites our knowledge that 2+2=4 as a paradigmatic example of such knowledge. In a case of knowledge that p from a report, one does not “see” p directly but sees p by seeing q, where q is some proposition concerning the report of one or more cognitive faculties. In this case, q but not p is part of one’s evidence. A paradigmatic example of knowledge from reports would be a belief based on memory. Christensen’s bill calculation case also seems to be a case where something known from a report is the subject of peer disagreement. When one of the friends concludes that each person’s share is $43, he does not “just see” that $43 is the correct answer. Rather, what he “just sees” is that the answer he has reached after a series of calculative steps (many of which he probably does not remember) is $43, and this is the basis for his belief that each person’s share is $43.

Assuming that there are these two types of knowledge, it is not implausible to think that knowledge from direct acquaintance is less susceptible to higher-order defeat than knowledge from reports. Because knowledge from reports involves trusting the “readouts” of one’s “cognitive instrument,” such knowledge is understandably threatened by worries raised about the reliability of that instrument or by the fact that some other similar instrument (an epistemic peer) delivers an inconsistent “readout.” Knowledge by direct acquaintance, on the other hand, is more fundamental to one’s cognitive perspective in that it is not mediated by instrumental reports. If such knowledge is not based on the report of one’s cognitive faculties, that knowledge may not be similarly undercut when one learns that an epistemic peer’s faculties deliver an inconsistent report. Therefore, on Bogardus’ view, if the content of a contested religious belief is known by direct acquaintance, or if a religious belief is based on some contested proposition that is known by direct acquaintance, then the party who enjoys such knowledge rationally ought to stand by the belief in the face of disagreement.

A second account as to why fundamental disagreements may pose less of a skeptical threat comes from Gellman (1993, 355ff.). Gellman argues that religious beliefs may be immune to defeat by disagreement if those beliefs are numbered among the “rock bottom” epistemological starting points that supply the basis for epistemic evaluation of other beliefs. However religious believers may have come to initially acquire their religious beliefs, for many believers these beliefs come to achieve rock-bottom status, alongside other commitments, such as basic rational principles and fundamental beliefs about the world, that serve as justifiers of other beliefs and that do not themselves stand in need of grounding. Gellman acknowledges that there is a hierarchy among such rock-bottom beliefs: some of these beliefs are given more weight in rational deliberation, and some are given priority in that they invariably trump other rock-bottom commitments when they conflict. He also holds that for many religious believers, core religious beliefs are hierarchically prior to many of the rational norms identified by epistemologists, including norms like DEFERENCE and INDEPENDENCE described above. Given this priority, Gellman maintains that it would not be rational for the religious believer to abandon core religious beliefs just because this is what DEFERENCE and INDEPENDENCE require.

It is, of course, questionable whether the above thinkers are right in thinking that beliefs that are suitably fundamental are thereby protected from the disagreement threat. Many will question the Cartesian optimism implicit in Bogardus’ conception of knowledge from direct acquaintance. And even if there are fundamental beliefs that are presumed “innocent” and that therefore do not stand in need of evidential support, as Gellman claims, it need not follow that such presumptive innocence remains intact in the face of direct challenge from other qualified thinkers. Finally, even if the significance of disagreement is mitigated in fundamental disputes, it may be that neither Bogardus nor Gellman has adequately articulated the relevant sense of “fundamentality.”

Even once the relevant sense of fundamentality is fully clarified, the question of whether a given religious disagreement is fundamental will in many cases be a controversial one. This is because there is significant disagreement among philosophers of religion on the place that religious belief occupies in the believer’s “noetic structure,” and thus on the source of religious disagreement. Consider, for instance, the conflicting accounts of reflective theistic belief developed by Richard Swinburne (2004) and Alvin Plantinga (2000). Swinburne maintains that reflective theists who are aware of evidential challenges to religious faith, including facts about religious diversity, will typically be unable to take their theistic convictions for granted, but will need to proportion their credence in theism to the evidence. Swinburne holds, moreover, that evidential reasoning about God’s existence can and should employ the same principles of confirmation theory that are widely accepted in the sciences, and that the pre-evidential probabilities that serve as the starting point for such reasoning can and should be sufficiently determined by the application of generally-accepted inductive criteria such as explanatory scope and simplicity. This view seems to suggest that when two equally informed thinkers disagree on the plausibility of theism, the most plausible explanation is that at least one of them has made some mistake in the application of agreed-upon criteria that serve as the epistemic starting points for both parties. If this is right, then there is some reason to think that cases of religious disagreement can be assimilated to the calculation case discussed above, a case of disagreement that seems not to be fundamental since the dispute stems from performance error on the part of one of the thinkers rather than from any fundamental divergence in the disputants’ perspectives antecedent to the process of calculation.

In contrast to Swinburne, Plantinga maintains that for most theists, the belief in God is not the product of inference, but is basic in that it is not based on other beliefs. Plantinga acknowledges that theistic beliefs are often prompted by certain experiences: upon viewing a breathtaking mountain vista, one might find oneself believing that the world was created by God; or after doing some evil act, one might find oneself believing that God disapproves of what one has done. According to Plantinga, however, while these experiences may occasion theistic beliefs, these beliefs are not based on inferential reasoning that appeals to facts about these experiences as evidence. Instead, these beliefs are like the belief in other minds, in the reality of the past, or in the reliability of memory: such beliefs are held with a high degree of confidence whether or not we are aware of any good arguments in their favor. If Plantinga is right that theistic belief is not typically grounded in evidential reasoning, then there is reason to think that disagreements between theists and atheists are typically fundamental in a way that the disagreement in the calculation case is not. Disagreements over theism would not result from some performance error in inferential reasoning, but would be the product of differences in the basic outlooks of different thinkers.

The aim in comparing Swinburne and Plantinga is not to suggest that if Plantinga is right, theistic belief is fundamental in a way that lessens its vulnerability to defeat by disagreement (or that, if Swinburne is right, theistic belief is more vulnerable to the disagreement threat). While this is a conclusion that some have drawn, the principal aim of the comparison is to show that even if we agreed on a characterization of “fundamentality” that protects beliefs from being defeated by disagreement, there may very well be disagreement concerning the structure of religious belief and the question of whether it is fundamental in the relevant sense. While there is some reason for thinking that Swinburne’s view of theistic belief would place theistic belief on the non-fundamental side, there are also considerations that call this supposition into question. There are potentially significant disanalogies between theistic belief on Swinburne’s picture and the calculation case, which does seem to be a paradigm of a superficial disagreement. For example, in the calculation case, the several steps that led to one’s answer are presumably forgotten, and the exact source of the disagreement cannot be pinpointed. (And if it could be pinpointed, no doubt one party would recognize their error.) One might think that in many religious disagreements, disputants can rehearse the most important steps of the reasoning that grounds their view, and that as a result they can locate the precise point where their reasoning diverges. And a disagreement that persists even when the point of divergence has been identified is quite different from one where the disagreement persists precisely because the two parties cannot reconstruct their reasoning and thus cannot identify the point of divergence. The former sort of disagreement, which is driven by stable differences in how one applies inferential norms, is perhaps fundamental in a way that the calculation disagreement, which results from some obscured performance error, is not.

In addition to questioning whether Swinburne’s picture supports the “non-fundamental” characterization of religious disagreement, there is also room to question whether religious disagreement on Plantinga’s picture really does qualify as fundamental in the relevant sense. While Plantinga maintains that theistic belief is basic, one might argue that on his model theistic belief is an instance of knowledge from reports rather than knowledge from direct acquaintance. Even if theistic belief is not inferred from facts about the report of some cognitive faculty, the believer may believe in response to a report from some cognitive faculty (what Plantinga calls the sensus divinitatis) and may not “just see” that God exists. Consider: basic perceptual beliefs seem susceptible to defeat by disagreement in a way that basic mathematical beliefs are not. If two normal and (up till now) healthy friends are standing before an open garage, and one says he sees a car in the garage and the other says the garage is empty, it is reasonable to suppose that both of them should significantly lower their confidence in their initial belief, since it is likely that someone is hallucinating and neither has reason to think that their friend is more susceptible to hallucination than they are. However, if these two friends are talking and it comes to light that one of them believes that 1+1=2 while the other believes that 1+1=5, it is less plausible to suppose that the friend with the correct belief should reduce confidence to any significant extent. Both of these disagreements arguably involve conflicts between basic beliefs, but basic perceptual beliefs appear to be more vulnerable to defeat than basic mathematical beliefs. Perhaps this is because basic mathematical beliefs arise from a direct awareness of mathematical truths, while perceptual beliefs are mediated by the reports of perceptual faculties. This diagnosis and the preceding discussion involve a number of controversial claims and assumptions, and these controversies will not be pursued here. The main point is that from the position that theistic belief is basic, it does not straightforwardly follow that theistic belief is among those beliefs that can plausibly be said to be resistant to defeat by disagreement in virtue of their fundamental status. The relevantly fundamental beliefs may be some subclass of basic beliefs, namely those that are the product of rational insight rather than the product of some perceptual or quasi-perceptual faculty.

6. Appeals to Religious Experience

For many religious believers, personal religious experience plays a crucial role in the formation, development, and sustaining of religious belief. Theravada Buddhists emphasize the importance of experiences arising from certain meditative disciplines, experiences that open the mind to the truth of certain Buddhist teachings. Charismatic Christians frequently refer to certain bodily sensations that serve as experiential signs of the presence and activity of the Holy Spirit. Theists of various stripes emphasize profound experiences of God’s presence and divine communication, experiences that frequently occur in times of prayer or worship but that may also come unbidden outside of any specific religious practice. In addition to claimed direct experience of God, many believers in God or gods purport to discern providential influence on their circumstances, and not infrequently believers claim to have witnessed or received physical healing in response to prayer. This is, of course, only a sample of the diverse religious experiences that are represented in religious traditions across the globe. Atheists who reject any religious viewpoint may also cite personal experiences in accounting for their disbelief—for example, experiences of silence and absence of divine comfort in a season of acute suffering.

The fact that such experiences frequently play a prominent role in motivating and supporting religious belief is potentially significant for an assessment of the epistemic significance of religious disagreement. Many who argue for the defeating power of disagreement are explicitly concerned with contexts where each side to the dispute has fully disclosed the grounds for their view. If disagreement is most worrying when it persists in contexts of full disclosure, then there is some reason to think that many religious disagreements will not present serious skeptical threats. To be sure, some “religious experiences” are such that their epistemically relevant content can easily be communicated to others. Suppose someone prays for a new car and the next day receives a car from a complete stranger who says that she felt moved to give her car away. The epistemically relevant aspects of such an experience could easily be communicated to others. (Whether the testimony would be believed is, of course, another story.) But one might think that in many instances of what we call “religious experience,” the content of the experience that is taken to be epistemically relevant cannot be communicated. Suppose that someone in desperate straits cries out to God for help and immediately experiences a “peace that surpasses all understanding” (Philippians 4:7), a peace that seems in its profundity to be a divine gift rather than a purely natural phenomenon. Could someone who believes in God partly on the basis of such experience fully disclose his reasons for belief? He could, of course, report having such an experience and describe the belief changes that seemed appropriate in its wake. However, the epistemic significance of the experience may significantly depend on subjective aspects of the experience whose qualities cannot be adequately communicated by means of verbal testimony (James 1902, 371). If this is right, then religious disagreements may be quite different from disagreements in many other domains where the subjective qualities of private experiences do not play a significant epistemic role.

There are reasons for doubting whether the significance of religious experience to religious belief could justify both sides of a religious dispute in confidently maintaining their religious beliefs in the face of disagreement (Schellenberg 2007, 182-3). Consider a disagreement between a Buddhist and a Muslim who both appeal to distinctive sorts of experiences in justifying their contested religious beliefs. While the Muslim does not herself experience the same sort of ineffable experiences that ground the Buddhist’s belief in, say, the doctrine of non-self, the Buddhist can tell the Muslim of his experiences and he can describe the doxastic responses that seem to him appropriate in light of the experiences. If the Muslim trusts the judgment of the Buddhist, then it seems that the Buddhist’s belief in non-self constitutes evidence that his experiences, combined with his other evidence, supply good evidence for the doctrine of non-self. Furthermore, evidence that there is good evidence for p is often itself evidence for p. Hence, the Buddhist’s belief in response to the reported experience may serve as a piece of proxy evidence that stands in for the experience itself. Since this proxy evidence is available to the Muslim, it seems that the incommunicability of the Buddhist’s experience does not prevent that experience from having indirect evidential weight for the Muslim. Of course, a symmetrical story can be told as to why the Muslim’s report of mystical experiences and her doxastic response can serve as proxy evidence that stands in for the experience itself and can be appreciated by the Buddhist. Assuming that both attach comparable weight to their experiences and have responded with equal conviction, there is arguably no reason for either thinker to maintain that his or her own experience should be given more evidential weight than the inaccessible experience of the trustworthy interlocutor. On this view, the inaccessibility of religious experience is unlikely to relieve religious believers of the worries raised by religious disagreement. As long as multiple sides accord significant weight to private experiences, there is a kind of epistemic symmetry that arguably demands a skeptical response.
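The claim that evidence of good evidence for p is itself evidence for p can be given a simple probabilistic gloss (this is a standard sketch offered for illustration, not a formulation found in the works cited here, and it relies on the screening-off assumption stated below). Let G be the proposition that there is good evidence for p, and let R be the Buddhist’s report of his experience and resulting conviction. Suppose that (i) P(p | G) > P(p | ¬G), (ii) learning R raises the probability of G, so that P(G | R) > P(G), and (iii) R bears on p only by way of G, so that P(p | G, R) = P(p | G) and P(p | ¬G, R) = P(p | ¬G). Then

\[
P(p \mid R) \;=\; P(p \mid G)\,P(G \mid R) + P(p \mid \neg G)\,P(\neg G \mid R) \;>\; P(p),
\]

so that, on these assumptions, the Muslim’s credence in the doctrine of non-self should rise upon receiving the Buddhist’s testimony, even though the experience itself remains inaccessible to her.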

Still, one might resist the above reasoning by noting that we do not have some metric that we can use to measure and compare the apparent evidential value of various mystical experiences. We communicate the perceived evidential significance of our experience through coarse-grained descriptive language, like “a deep and incredibly profound sense of God’s love” or “a brilliantly clear insight into the unity of all things,” language that is not calibrated in a way that would allow us to make reliable interpersonal comparisons of the significance of different mystical experiences. It is possible that two speakers could both be reasonable in describing their religious experiences as, say, “utterly profound and clarifying” even though one person’s experience was in fact much more profound and clarifying than the other’s. The fact that two people use similarly strong language to describe their experiences is poor evidence that the experiences were comparable in their epistemic import. Of course, this by itself does not give one any reason for thinking that one’s own experience is likely to be more significant than someone else’s similarly-described experience. All the same, consider the case of some religious believer who has had a mystical experience of arresting intensity and profundity, and who attempts to convey the significance of this experience using fairly extreme language, and then discovers that believers from opposing standpoints use similarly extreme language to convey the apparent significance of their own mystical experiences. If the religious believer thinks that it is quite plausible that people would use similarly extreme vocabulary even if their experiences were much less profound and compelling than his own, and if he can easily entertain the possibility of others having less compelling experiences than his own but cannot easily entertain the possibility of others having experiences that are more compelling than his own, then he might be reasonable in believing that his own experience is evidentially more significant than the experiences of his disputants (despite the fact that these experiences are similarly described). According to this reasoning, religious belief that is grounded in surprisingly powerful experiences might be reasonably held in the face of religious disagreement even if multiple sides cite similar “powerful” religious experiences in explaining their view.

7. Faith and Practical Responses to the Problem of Religious Disagreement

Epistemic or “theoretical” rationality is the sort of rationality that is principally exhibited by someone’s beliefs, and the norms of epistemic rationality are concerned with such matters as logical consistency and evidential support. Practical rationality, on the other hand, is the sort of rationality that is principally exhibited by someone’s actions, and the norms of practical rationality are concerned with such matters as the compatibility of one’s various goals and the degree to which one’s actions conduce towards the attainment of those goals. The discussion thus far has proceeded under the assumption that whether religious conviction is rational in light of disagreement is a matter to be settled by the norms of epistemic rather than practical rationality. This assumption is contested by many who maintain that the reasonability of religious belief, or at least of religious faith, is best evaluated from the standpoint of practical rather than (or in addition to) epistemic rationality. According to these thinkers, reasonable religious conviction is often based not on the sort of evidential reasons that bear on the question of whether religious claims are true or probable, but instead on moral, prudential, or existential reasons for thinking that it would be in some way good or valuable to have some particular religious commitment. For example, a theist might believe in God for the reason that belief in God gives her a sense of deep purpose, both for her own life and for the cosmos as a whole, or because it helps her to maintain her moral commitments even when they lead to significant suffering. If religious belief may be rational in light of such practical reasons, and if religious disagreement does not pose a threat to the practical justification of religious belief in the same way that it threatens its epistemic justification, then the claim that religious belief ought to be abandoned on account of religious disagreement is arguably more questionable.

It might seem that practical reasons could make religious convictions rational only if those convictions are based on practical reasons, and religious convictions can be based on practical reasons only if they are voluntary. Many philosophers maintain that beliefs are not voluntary, and for this reason are not evaluable according to the norms of practical rationality. If this is right, then the rationality of religious belief is arguably a matter of epistemic rather than practical rationality. The faith of the religious “believer” may not always be an instance of belief in the conventional sense, however. Of the philosophers who have considered the nature of faith, a good number have argued for a “non-cognitive” conception of faith that does not require outright belief in the propositions that are the object of faith. On Alston’s (1996) view, for example, one may fail to believe a proposition while nonetheless “accepting” it as a matter of faith. Such acceptance is like belief in many respects—one views the world from a standpoint that takes the accepted proposition for granted, and one employs the accepted proposition as a premise in practical reasoning—but acceptance is a voluntary state that does not require believing the proposition or judging it to be more probable than not.

Even if non-cognitivists about faith are wrong and belief is essential to faith, there could still be reasons why religious faith is appropriately evaluated from the standpoint of practical rationality rather than (or in addition to) epistemic rationality. First, some contest the claim that belief is inevitably involuntary. William James (1896), for example, argues that belief is governed by two competing aims (“Believe truth! Shun error!”), and how these aims are prioritized in a given context may be a voluntary matter that helps to determine whether one ends up believing a given claim. Second, even if belief cannot be chosen in the rather direct manner supposed by James, there is little doubt that one can undertake courses of action that may indirectly influence one’s religious beliefs.

Granting that religious faith is responsive to practical reasons, either because it is a voluntary state that does not require belief at all or because religious belief can be directly chosen or indirectly influenced by voluntary means, what implications does this have for the rational significance of religious disagreement? Holley (2013) suggests that if commitment to a religious way of life is valuable, then religious belief will likely be practically rational. For engaging in a religious way of life tends to produce religious beliefs, and the exercise of epistemic discipline that would be required to avoid falling into belief is likely to be incompatible with genuinely entering into the way of life in question. For this reason, Holley maintains that one can be reasonable in persisting in religious belief even if systematic religious disagreement defeats the epistemic justification of religious belief. Just how much erosion of epistemic support can the practical grounds for faith tolerate? The answer is by no means clear. For example, even if one can reasonably accept a proposition for which one has a credence of around 0.5 (a credence that might be insufficient for belief), there still could be non-trivial credence thresholds below which acceptance is not practically rational. If this is right, then the degree to which a given claim enjoys epistemic support is not irrelevant to an assessment of the practical rationality of accepting that claim. Those who argue on Jamesian grounds that belief can be responsive to practical considerations often hold that believed propositions must be judged more probable than not, and that this judgment of probability should be responsive to purely epistemic considerations (Pace 2011). As these examples suggest, the mere fact that practical considerations are relevant to an assessment of religious faith does not mean that the practical rationality of faith can be settled without reference to its epistemic merits.
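The possibility of a non-trivial credence threshold for acceptance can be illustrated with a toy decision-theoretic sketch (the utilities are hypothetical, and the framing is offered for illustration rather than as Holley’s or Pace’s own). Suppose that accepting p and ordering one’s life around it yields a benefit b if p is true and a cost c if p is false, while withholding acceptance yields neither. Then acceptance maximizes expected utility only if

\[
Cr(p)\,b \;-\; \bigl(1 - Cr(p)\bigr)\,c \;>\; 0, \qquad \text{that is,} \qquad Cr(p) \;>\; \frac{c}{b + c}.
\]

With b = c the threshold is 0.5; if the costs of accepting a falsehood loom larger than the benefits of accepting a truth, the threshold rises above 0.5, so that sufficiently eroded epistemic support would render acceptance practically irrational.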

Even if we agreed that the merits of some religious belief that p should be evaluated according to purely practical criteria that in no way depend on the strength of the epistemic reasons for the belief, it is still possible that knowledge of religious disagreement could constitute a defeater that renders religious belief unreasonable. This is because in addition to disagreements about the truth of various religious truth claims, there is also disagreement concerning the merits of various practical or “pragmatic” arguments for religious belief. This disagreement could undermine the epistemic justification of the beliefs that constitute the practical grounds for religious belief, and one might think that a practical rationale whose epistemic justification has been defeated cannot ground reasonable religious belief. Suppose that Theo believes in God on purely practical grounds. Perhaps Theo believes on the basis of a Kantian argument that concludes that belief in God is important in order to engage in the moral life without despair. Alternatively, perhaps Theo believes for the Kierkegaardian reason that passionate commitment in the face of “objective uncertainty” is the highest form of human existence. Still another possibility is that his faith is a response to the prudential reasoning articulated in Pascal’s “Wager” argument. All of these arguments are the subject of immense controversy. If these arguments must be epistemically justified in order to make it practically rational to have religious faith, then disagreement would threaten to undermine religious faith even if the religious claims that are accepted by faith do not themselves stand in need of epistemic justification. Moreover, several opponents of religious faith offer arguments for the conclusion that religious belief is positively harmful, either to the believer or to society as a whole (Fumerton 2013). Practically rational religious belief arguably requires that one be epistemically justified in rejecting such arguments, but disagreement of the right sort might undercut such justification.

8. Conclusion

Even if individual attempts at characterizing the rational significance of religious disagreement prove controversial, for many thinkers, including many religious believers, the intuition that persistent religious disagreement poses a significant challenge to religious belief is remarkably strong. As this article has attempted to show, clarifying the nature and scope of that challenge requires not only that one resolve various controversial questions in the epistemology of disagreement, but also that one settle difficult questions concerning such matters as the place of religious belief in the noetic structure of religious believers, the epistemic significance of various types of religious experiences, the role played by practical reasons in grounding religious conviction, and the theories of religious epistemic credentials implied by various religious belief systems. Given the complexity of such questions, there is little doubt that the epistemic significance of religious disagreement will remain a topic of lively philosophical dispute.

9. References and Further Reading

  • Adams, Robert M. 1994. “Religious Disagreements and Doxastic Practices.” Philosophy and Phenomenological Research 54 (4): 885–90.
  • Alston, William P. 1991. Perceiving God: The Epistemology of Religious Experience. Ithaca, NY: Cornell University Press.
  • Alston, William P. 1996. “Belief, Acceptance, and Religious Faith.” In Faith, Freedom, and Rationality, edited by Jeff Jordan and Daniel Howard-Snyder, 3–27. Lanham, MD: Rowman & Littlefield.
  • Baldwin, Erik, and Michael Thune. 2008. “The Epistemological Limits of Experience-Based Exclusive Religious Belief.” Religious Studies 44 (4): 445–55.
  • Basinger, David. 1999. “The Challenge of Religious Diversity: A Middle Ground.” Sophia 38 (1): 41–53.
  • Basinger, David. 2002. Religious Diversity: A Philosophical Assessment. Aldershot, UK: Ashgate.
  • Bogardus, Tomas. 2013a. “The Problem of Contingency for Religious Belief.” Faith and Philosophy 30 (4): 371–92.
  • Bogardus, Tomas. 2013b. “Disagreeing with the (Religious) Skeptic.” International Journal for Philosophy of Religion 74 (1): 5–17.
  • Bourget, David, and David Chalmers. 2009. “The PhilPapers Surveys: Results, Analysis, and Discussion.” Accessed August 5, 2015. http://philpapers.org/surveys/.
  • Christensen, David. 2007. “Epistemology of Disagreement: The Good News.” Philosophical Review 116 (2): 187–217.
  • Christensen, David. 2009. “Disagreement as Evidence: The Epistemology of Controversy.” Philosophy Compass 4 (5): 756–67.
  • Christensen, David. 2010. “Higher-Order Evidence.” Philosophy and Phenomenological Research 81 (1): 185–215.
  • Christensen, David. 2011. “Disagreement, Question-Begging and Epistemic Self-Criticism.” Philosophers’ Imprint 11 (6): 1–22.
  • Draper, Paul, and Ryan Nichols. 2013. “Diagnosing Bias in Philosophy of Religion.” Monist 96 (3): 420–46.
  • Elga, Adam. 2007. “Reflection and Disagreement.” Noûs 41 (3): 478–502.
  • Everett, Theodore J. 2001. “The Rationality of Science and the Rationality of Faith.” The Journal of Philosophy 98 (1): 19–42.
  • Feldman, Richard. 2007. “Reasonable Religious Disagreements.” In Philosophers Without Gods: Meditations on Atheism and the Secular Life, edited by Louise M. Antony, 194–214. New York: Oxford University Press.
  • Frances, Bryan. 2008. “Spirituality, Expertise, and Philosophers.” Oxford Studies in Philosophy of Religion 1: 44–81.
  • Fumerton, Richard. 2013. “Epistemic Toleration and the New Atheism.” Midwest Studies In Philosophy 37 (1): 97–108.
  • Gellman, Jerome. 1993. “Religious Diversity and the Epistemic Justification of Religious Belief.” Faith and Philosophy 10 (3): 345–64.
  • Gutting, Gary. 1982. Religious Belief and Religious Skepticism. Notre Dame, IN: University of Notre Dame Press.
  • Hick, John. 2004. An Interpretation of Religion: Human Responses to the Transcendent. 2nd ed. New York: Palgrave Macmillan.
  • Holley, David M. 2013. “Religious Disagreements and Epistemic Rationality.” International Journal for Philosophy of Religion 74: 33–49.
  • Ipsos. 2011. “Ipsos Global @dvisory: Supreme Being(s), the Afterlife and Evolution.” Ipsos In North America. Accessed August 6, 2015. http://www.ipsos-na.com/news-polls/pressrelease.aspx?id=5217.
  • James, William. 1896. The Will to Believe and Other Essays in Popular Philosophy. New York: Longmans, Green and Co.
  • James, William. 1902. The Varieties of Religious Experience. New York: Random House.
  • Kelly, Thomas. 2005. “The Epistemic Significance of Disagreement.” Oxford Studies in Epistemology 1: 167–96.
  • Kelly, Thomas. 2011. “Consensus Gentium: Reflections on the ‘Common Consent’ Argument for the Existence of God.” In Evidence and Religious Belief, edited by Kelly James Clark and Raymond J. VanArragon, 135–56. New York: Oxford University Press.
  • Kelly, Thomas. 2013. “Disagreement and the Burdens of Judgment.” In The Epistemology of Disagreement: New Essays, edited by David Christensen and Jennifer Lackey. Oxford University Press.
  • King, Nathan L. 2008. “Religious Diversity and Its Challenges to Religious Belief.” Philosophy Compass 3 (4): 830–53.
  • Kitcher, Philip. 2014. Life After Faith: The Case for Secular Humanism. Yale University Press.
  • Koehl, Andrew. 2005. “On Blanket Statements About the Epistemic Effects of Religious Diversity.” Religious Studies 41 (4): 395–414.
  • Lackey, Jennifer. 2014. “Taking Religious Disagreement Seriously.” In Religious Faith and Intellectual Virtue, edited by Laura Frances Callahan and Timothy O’Connor. Oxford University Press.
  • Lasonen-Aarnio, Maria. 2014. “Higher-Order Evidence and the Limits of Defeat.” Philosophy and Phenomenological Research 88 (2): 314–45.
  • McKim, Robert. 2001. Religious Ambiguity and Religious Diversity. New York: Oxford University Press.
  • Pace, Michael. 2011. “The Epistemic Value of Moral Considerations: Justification, Moral Encroachment, and James’ ‘Will To Believe.’” Noûs 45 (2): 239–68.
  • Pittard, John. 2014. “Conciliationism and Religious Disagreement.” In Challenges to Moral and Religious Belief: Disagreement and Evolution, edited by Michael Bergmann and Patrick Kain, 80–97. New York: Oxford University Press.
  • Plantinga, Alvin. 1995. “Pluralism: A Defense of Religious Exclusivism.” In The Rationality of Belief and the Plurality of Faith, edited by Thomas David Senor, 191–215. Ithaca, NY: Cornell University Press.
  • Plantinga, Alvin. 2000. Warranted Christian Belief. New York: Oxford University Press.
  • Quinn, Philip L., and Kevin Meeker. 2000. The Philosophical Challenge of Religious Diversity. New York: Oxford University Press.
  • Rotondo, Andrew. 2013. “Undermining, Circularity, and Disagreement.” Synthese 190 (3): 563–84.
  • Schellenberg, J. L. 2007. The Wisdom to Doubt: A Justification of Religious Skepticism. Ithaca, NY: Cornell University Press.
  • Schoenfield, Miriam. 2014. “Permission to Believe: Why Permissivism Is True and What It Tells Us About Irrelevant Influences on Belief.” Noûs 48 (2): 193–218.
  • Swinburne, Richard. 2005. Faith and Reason. 2nd ed. New York: Oxford University Press.
  • Thune, Michael. 2010. “Religious Belief and the Epistemology of Disagreement.” Philosophy Compass 5 (8): 712–24.
  • Thurow, Joshua C. 2012. “Does Religious Disagreement Actually Aid the Case for Theism?” In Probability in the Philosophy of Religion, 209–24. Oxford: Oxford University Press.
  • Titelbaum, Michael G. 2015. “Rationality’s Fixed Point (Or: In Defense of Right Reason).” Oxford Studies in Epistemology 5.
  • van Inwagen, Peter. 1996. “It Is Wrong, Everywhere, Always, and for Anyone, to Believe Anything upon Insufficient Evidence.” In Faith, Freedom and Rationality: Essays in the Philosophy of Religion, edited by Jeff Jordan and Daniel Howard-Snyder, 137–53. Lanham, MD: Rowman & Littlefield.
  • Vavova, Katia. 2014. “Moral Disagreement and Moral Skepticism.” Philosophical Perspectives 28 (1): 302–33.
  • White, Roger. 2005. “Epistemic Permissiveness.” Philosophical Perspectives 19 (1): 445–59.

Author Information

John Pittard
Email: john.pittard@yale.edu
Yale University
U. S. A.

Karl Rahner (1904-1984)

Karl Rahner was one of the most influential Catholic philosophers of the mid to late twentieth century. A member of the Society of Jesus (Jesuits) and a Roman Catholic priest, Rahner, as was the custom of the time, studied scholastic philosophy, through which he discovered Thomas Aquinas. From Aquinas’ epistemology and philosophical psychology Rahner was introduced to the Aristotelian-Thomistic notion of abstraction. This theory holds that human beings, as embodied souls or spirits, directly know only that which is sensed; direct sensory knowledge is physical knowledge. The intellect, through complex actions best described as abstraction, draws from sensory knowledge an indirect but valid knowledge of spiritual or non-physical realities. Thus, Rahner, learning from Thomas, held that it is the abstractive power of the mind that leads to indirect knowledge of the spiritual. Rahner’s engagement with Immanuel Kant led him to the philosophical work of Joseph Maréchal, a fellow Jesuit. Maréchal attempted to use Kant to create a revitalized Thomism. Maréchal held that the dynamic of the mind transcends the dichotomy of phenomenon and noumenon by attaining the utter unity of the Absolute. Rahner learned from Maréchal that the Kantian frustration could be overcome by the dynamic of the mind. Finally, Rahner learned from Pierre Rousselot, another Jesuit, that the mind’s dynamic is drawn to the Absolute because the Absolute is the pure unity of being and spirit. So from Rousselot, Rahner understands the absolute terminus of the dynamic of mind to be the pure unity of being and spirit. It is from these strands that Rahner weaves his unique philosophical system.

Table of Contents

  1. Life
  2. Influences
    1. Kant
    2. Rousselot
    3. Maréchal
    4. Heidegger
    5. Summary
  3. Rahner’s System
    1. Geist in Welt
    2. Hörer des Wortes
  4. Summary
  5. References and Further Reading
    1. Karl Rahner: Primary
    2. Karl Rahner: Secondary
    3. Immanuel Kant: Primary
    4. Pierre Rousselot: Primary
    5. Pierre Rousselot: Secondary
    6. Joseph Maréchal: Primary
    7. Joseph Maréchal: Secondary
    8. Martin Heidegger: Primary

1. Life

Karl Rahner was born 5 March 1904 in the university town of Freiburg-im-Breisgau in the then Grand Duchy of Baden, the fourth of seven children. His father Karl was an educator; his mother Luise, a homemaker. Rahner’s mother was pious, but in a healthy sense: the atmosphere of a university town imbued that piety with openness. It can therefore be said that Rahner’s childhood laid the groundwork for his later complex philosophical and theological projects: a pious openness, seeking the most effective formulations to gain insight into the character of the world.

At age eighteen, on 20 April 1922, Karl Rahner entered the novitiate of the Society of Jesus. The Jesuits had been, since their inception, an intellectual religious congregation: among their number were philosophers such as Francisco Suárez; nascent biologists such as Athanasius Kircher, an early observer of microbes; missionary-linguists such as Matteo Ricci; paleontologist Pierre Teilhard de Chardin; and cosmologist Georges Lemaître. It was thus the perfect environment in which Rahner might begin to develop his own thought: intellectually rigorous, pioneering, open.

After the completion of novitiate and the taking of vows Rahner entered the scholasticate, the formal program of studies. These studies were founded upon the current Neo-Scholastic manuals, much defamed but actually thorough presentations of Scholastic thought. Rahner was deeply influenced by Aquinas, of course; Aquinas had been mandated by Leo XIII as the Catholic philosopher. At the same time Rahner discovered three of the four major influences that would form his intellectual horizons: Immanuel Kant and fellow Jesuits Pierre Rousselot and Joseph Maréchal. It was, however, Maréchal, and Maréchal’s interpretation of Kant, who became the decisive impetus to Rahner’s ongoing philosophical reflections.

Rahner’s superiors soon noted the caliber of his intellect, and so he was sent to the University of Freiburg in Freiburg-im-Breisgau, his home, to begin doctoral studies in philosophy in 1934: his superiors foresaw for him a university career as a professor of philosophy. It was at Freiburg where Rahner, despite his sincere acknowledgements of the importance of Aquinas, Kant and especially Maréchal, discovered the philosopher whom he would later call his true teacher: Martin Heidegger. Until his death Rahner kept with him the list of courses he had taken with Heidegger. Heideggerian thought became the catalyst through which the transcendental philosophies of Kant, and especially Kant through Maréchal, began to coalesce into the Rahnerian philosophical system. In 1936, Rahner submitted his doctoral thesis, Geist in Welt, usually rendered Spirit in the World, which attempts a radical re-reading of Aquinas through Kant, Maréchal and Heidegger. Geist in Welt was essentially a lengthy gloss on a single question in Thomas’ Summa, an intricate, complicated, tightly woven, and impenetrable Maréchallian-Heideggerian interpretation. It was rejected as being too influenced by Heidegger. That same year Rahner was transferred to Salzburg to study theology, gaining a doctorate there.

Rahner then began his university career in 1937 at the University of Innsbruck. In that same year Rahner gave a series of lectures at Innsbruck; these became the basis for Rahner’s last purely philosophical work, his philosophy of religion and revelation, Hörer des Wortes (Hearer of the Word).

During the war years and post-war years, 1939-1949, Rahner engaged in pastoral work in Vienna. After the war he returned to Innsbruck in 1949 and began to develop his theological system, a system rooted completely in the metaphysics of Geist in Welt and Hörer des Wortes: human beings, finite, yet invested with an infinite and inexhaustible epistemological dynamic, are intrinsically open to the revelation of the utter mystery that is God. Thus religion is the thematization of the absolutely unthematic.

While at Innsbruck in 1962 Rahner’s superiors received a monitum from the Holy Office in Rome: Rahner was neither to lecture nor publish without Rome’s explicit permission. The irony: in that same year Pope John XXIII named Rahner a peritus to the Second Vatican Council. Rahner’s influence was profound; it was Rahner who was the principal behind the drafting of Lumen Gentium, The Dogmatic Constitution on the Church. The monitum, obviously, disappeared into bureaucratic oblivion. Rahner taught at Innsbruck until 1964. From 1964 until 1981 Rahner taught at the University of Munich. Rahner retired to Innsbruck, where he died, 30 March 1984. Karl Rahner was, and remains, a powerful influence upon the Roman Catholic Church. Rahner’s philosophy was at once the foundation and framework for his far-reaching and in some ways radically different re-reading of Roman Catholic dogma. It is important to note, however, that Rahner’s philosophical system precedes and is separable from the theological system built upon it.

2. Influences

In summary: Rahner derived the notion of the transcendental structure of knowledge from Kant, and from Rousselot and Maréchal he derived the notion of the infinite dynamic inherent in this transcendental structure. This infinite dynamic possesses an intrinsic inevitability toward the Absolute or God. Because of his exposure to Heidegger’s system of thought, Rahner ultimately came to characterize human beings as utterly finite yet as ever ordered to being.

a. Kant

Immanuel Kant (1724-1804) brought to a crescendo the “philosophy of the subject” that had been steadily on the rise from the time of the great Scholastics. For Kant an authentic subjectivity, one that at once addressed the real, however unknowable (the noumenal), and from that address structured the known (the phenomenal), was the only answer to the radical skepticism of Hume. It was in response to this skepticism that Kant created his great work, Kritik der Reinen Vernunft (Critique of Pure Reason) in two editions: the first (A) in 1781, the second, greatly expanded edition (B), in 1787. The Kritik was the impetus for Joseph Maréchal to establish Transcendental Thomism, which, in turn, decisively influenced Rahner. This is the central concern of the Kritik: how can one gain certitude? How, in the face of radical skepticism, can one be sure of the world? Simply put, is knowledge possible, and, if so, what is the guarantee of that knowledge? Kant reasoned that the facticity of this or that experience is formed within a grid of pre-determined schemata, and from the application of these schemata there emerges consistent, verifiable and, thus, dependable knowledge. These schemata are the a priori structures of reason. These a priori structures, the categories, render that which is experienced globally and temporally consistent. The categories, the a priori structures of reason, are therefore frameworks to which the to-be-known must conform to be known. Thus, Kant finds the consistency and dependability of knowledge in the constant a priori schemata or categories proper to reason as reason. Kant was fully satisfied that he had established a lasting guarantee of the certainty of knowledge. Joseph Maréchal, alongside Heidegger one of the decisive influences upon Rahner’s philosophical thought, would serve as mediator between Kant and Rahner. It is through Maréchal’s system that transcendental idealism is married to Aquinas’ Aristotelian epistemology. The mind, dynamic in its address of that which is to be known, structures the known through abstraction, but this abstraction is a neo-Kantian impetus to the Absolute or God.

b. Rousselot

Pierre Rousselot (1878-1915) was ordained in England in 1908. Remarkably (and in significant demonstration of his intellectual capabilities) Rousselot completed his theological preparations for ordination while simultaneously earning the customary two doctorates from the Sorbonne in 1908, the year of his ordination: the major thesis, L’intellectualisme de saint Thomas, and the minor thesis, Pour l’histoire du problème de l’amour au Moyen Âge. In L’intellectualisme Rousselot created an entirely unique Neo-Thomist system, one he styled as a Platonized Thomism. The entire system hinges on the identification of spirit and being in divinity as the very nature of the Godhead. This was an attempt at a defensible interpretation of Thomas. The unity of spirit and being that is God is thus the self-knowing of God as his being: in other words, God is infinite intelligibility utterly transparent in pure self-possession. Rousselot takes this as his model for all forms of knowledge. Rousselot holds that every act of comprehension, of understanding, as the discovery and appreciation of intelligibility, is in fact an affirmation of the existence of the God which is pure intelligibility. From Rousselot, Rahner came to appreciate the identity of spirit and being, and thus the intrinsic intelligibility of all that is, which, when realized in the act of intellection, requires the co-affirmation of the existence of God. Rahner learned from Rousselot that knowing strengthens the relation to the Absolute. For Rahner the mind is an organ of the affirmation of divinity. It was Rousselot who opened Rahner to Maréchal.

c. Maréchal

Maréchal and Heidegger were the decisive influences upon Rahner’s thinking. Joseph Maréchal was a contemporary of Rousselot, but although there was productive dialogue between the two, Maréchal pursued a deliberately Kantian path. Maréchal (1878-1944) entered the Society of Jesus in 1895, and while in England during the Great War he began work on his system, explicated in the five-volume opus Le Point de Départ de la Métaphysique: Leçons sur le Développement Historique et Théorique du Problème de la Connaissance (henceforward simply Le Point de Départ de la Métaphysique). Maréchal sought in these five volumes to trace the history of western philosophy and describe what he thought to be the true philosophical system: a Thomism rethought in the light of Kant.

It is the fifth volume of Le Point de Départ de la Métaphysique, titled Le Thomisme devant la Philosophie Critique, that deeply influenced Rahner. Maréchal appreciates Kant’s notion that the mind is an active and dynamic structuring of that which it knows. However, he believes that Kant fails to honor the character of this dynamic. Thus, Maréchal sees the dynamic as twofold. First, it is the dynamic openness of the mind, for the mind seeks to grasp all it encounters. Second, the dynamic of the mind is invested with an intrinsic direction to a specific end. The dynamic of the mind, investing all that can be known, structuring the knowable to the known, is quite literally driven to an end (in Maréchal’s scholastic vocabulary, it is possessed of an irresistible “finality”) which is its ultimate goal. Maréchal argues that Kant does not appreciate the power of mind he has discovered. Maréchal sees that the Kantian transcendental dynamic, the mind structuring the knowable to the known, must ultimately terminate in absolute being. Maréchal holds that the dynamic, searching, structuring character of the mind will structure everything knowable to the known. This Kantian impetus is ultimately rooted in its trajectory toward the absolute, toward the absolute as absolute being in which absolute being is absolute truth. For Maréchal, as for Rousselot, every act of knowing is at once an implicit affirmation of the absolute that is absolute being as absolute truth. Here Rahner found the means to go beyond Kant and give systemic form to Rousselot’s lyrical Platonized Thomism: Maréchal gave Rahner the Thomist framework with which to appropriate both Rousselot’s theme of the identity of spirit and being in the pure intelligibility that is God and his theme of the utter dynamic of human knowing grounded in that identity.

d. Heidegger

It was Heidegger who was perhaps the greatest influence on Rahner. It is the marriage of Heidegger’s thinking to that of Maréchal’s that joins Heideggerian finitude as open-ness to being-as-irreducible to the Maréchallian dynamism of knowing intrinsically ordered to the Absolute. This move firmly established the foundation of Rahner’s philosophical system. Martin Heidegger’s (1889-1976) Sein und Zeit is the final and decisive influence on Rahner. Its themes, melded with those of Maréchal, give the cast to Rahner’s thought. The animating principle and overarching motif of Sein und Zeit is being or being at its most irreducible. In it Heidegger seeks to discover, amongst and through and beneath the myriad kinds of beings, that uttermost manner of being-ness that underlies this myriad. To discover this being-most-irreducible it is necessary to seek this being, and it is human beings who seek being-most-irreducible. Heidegger calls this being Dasein: literally, being-there, being-emplaced, being-in-and-of-the-world-of-beings. Dasein puts being-most-irreducible in question when it gives itself over to the mystery that is being-most-irreducible. Dasein becomes the possibility of being-most-irreducible revealed as it is. But how, then, does Dasein in its questioning quest reveal being-most-irreducible? Only in the authenticity of Dasein: being authentic. How does Dasein be-authentically? Through Dasein’s realization of its utter finitude. And how is finitude disclosed to Dasein? When Dasein as being-authentically accepts being-toward-death. The complete acceptance of death, the radical finitude of being, opens Dasein to being-most-irreducible, for radical finitude recognizes that which is most irreducible as the reply to that finitude. Rahner neatly synthesizes Maréchal’s dynamic of mind inevitably ordered to absolute being with Heidegger’s notion of Dasein.

e. Summary

It is through Maréchal that Rahner understands Kant, and it is Maréchal’s notion of the dynamic of mind thrusting toward absolute being that becomes the core of Rahner’s system. Rousselot provides the inspiration in the identity of spirit and being in knowing, and Heidegger’s thinking brings these elements together. The dynamic thrust of the knowing mind grasps being-most-irreducible as God and as the unity of being and spirit in knowing: God is implicitly and intrinsically affirmed in the dynamic of every act of knowing. Thus, through Maréchal Rahner appropriates the Kantian notion of the transcendental structuring of the known by the mind. Through Rousselot and most especially Maréchal, Rahner sees this structuring of the known as a drive to attain the Absolute or God; it is through Heidegger that this drive is rooted in radically finite human beings; and it is from Rousselot that God is disclosed as the identity of spirit and being.

3. Rahner’s System

Rahner’s system is fully explicated in Geist in Welt. Here the foundation is meticulously and densely established. Rahner’s second work, Hörer des Wortes, forwards a system whereby the human and the divine are intrinsically ordered one to the other. For Rahner, human being as defined in Geist in Welt is intrinsically open to God or the Absolute. It is necessarily the receptacle of revelation. In Rahner’s view, even if God or the Absolute remains utterly silent and completely hidden, that silence and hiddenness are, in fact, revelations.

a. Geist in Welt

Translating Geist in Welt as Spirit in the World demonstrates Rahner’s dense use of language. Geist qua spirit denotes both spirit as the unity of being and spirit as known and knowing in human reason, as demonstrated in the thinking of Rousselot. The preposition in does not simply note location but is also indicative of a movement-towards; Welt is the Heideggerian Welt, the world as the location of Dasein as the arena of the quest for being-most-irreducible. Geist in Welt is remarkably simple in concept and extraordinarily complex in execution. It takes a single question and a single article from Thomas’ Summa Theologiae (ST) and uses them as the fulcrum to erect the Rahnerian synthesis. The question: ST I, q 84, a 7; seven hundred words devoted to the crux of Thomistic psychology and epistemology become the cornerstone of Rahner’s philosophical system.

Thomas notes that in this life the soul is joined to its body; it is through the physicality of its body that the soul interacts with the world. Thomas, following his master Aristotle, holds that the intellect (mind) cannot directly know what the body senses and perceives. The mind is spirit, the body matter. Keep in mind that Thomas is not a dualist like Descartes; soul and body, spirit and matter are united, wholly interactive and completely congruent. However, a transition from the sensed and the perceived to the known is necessary for Thomas. This transition occurs within the process of intellection (the process of the mind coming to understanding). Thomas, in a manner reminiscent of Kant, also holds that the mind does not directly know the world. Thomas did not, however, see the mind as possessing a priori structures; Rahner introduces a Kantian a priori thematic into Thomas through Maréchal. Thomas sees that there is a mediated structure to knowing, and this mediation occurs in the following sequential constellation. The imagination receives the impressions of the experienced from the senses and creates an image of that which is experienced, the phantasm. The phantasm is received by the active (or agent) intellect, which abstracts from the phantasm the universal(s) proper to the object(s) experienced and thus contained in the phantasm; this is the intelligible species. The passive (or possible) intellect receives the intelligible species and renders it to the verbum mentis, the “mind’s word,” which is the achieved comprehension and attained understanding.

So like Kant, Thomas does not believe human beings directly experience the world as it is; unlike Kant, there are no categories to structure the perceived world as understood. Rather there is a complex translation of the perceived (which is of the body) to the understood (which is of the soul). It is important to remember that Thomas insists human beings cannot know through their spiritual essence (here the soul) as do the angels. Human beings are of the world and so all human knowledge is the result of mediation and translation. It is this thematic that Rahner appropriates as the jumping off point of Geist in Welt and it is also this thematic that, through Maréchal and Heidegger, becomes the very core of Rahner’s philosophical system. Rahner begins with the process of abstraction which is the work of the agent intellect upon the phantasm. Rahner, along with Thomas, notes the world is the arena of the metaphysical. It is in and through the world that human beings through their agent intellect encounter and grasp thematically being-most-irreducible and also the unthematically present absolute being, God.

Thus: esse—Thomas’ word for being-most-irreducible—is, via Rahner, now the Worauf of the Vorgriff, which is of the agent intellect. Worauf, a compound of wo (where or how) and auf (toward, up to, into), might be rendered, however ungrammatically, as “potential-toward.” Vorgriff is another Rahnerian coinage: vor means before, previous, ahead; and greifen (from which griff is derived) means seize, grasp, hold or comprehend. Thus the Vorgriff is that prior projectedness to comprehension. Esse, then, grasped by the agent (active) intellect, is the potential toward which the prior projectedness to comprehension is directed. It is the Vorgriff, the projectedness to comprehension (as grasping, seizing), that is key to Rahner’s system. The Vorgriff bounds and compasses being-most-irreducible. It is at once, however, non-objective in this bounding and compassing. The Vorgriff is the condition of all knowing and the Vorgriff, as the bounding and compassing of being-most-irreducible, is a directedness to the absolute being, God, which is the ground of being-most-irreducible.

It is this Vorgriff that is the condition or possibility of the knowing of all objective beings. Indeed, all objective beings, and all possible objects of knowledge, are of the index of being that is the Vorgriff. The passive (possible) intellect is the self-becoming of human beings as knowing beings. Receiving the verbum mentis, which is the appropriation of the endless scope of the Vorgriff of the agent (active) intellect, the passive intellect is therefore the human spirit as identity of being and spirit. The passive intellect realizes through the active intellect the utterness of being-most-irreducible. The passive intellect becomes all beings as rendered knowable by the agent intellect through the Vorgriff. It is the full scope of being, as spirit and as known, which is the dynamic of the human mind encompassing being-most-irreducible. The human spirit is directed toward being-most-irreducible through the Vorgriff as the potential prior-directedness to the comprehension of being-most-irreducible. In turn, being-most-irreducible is constituted for the human knower as the identity of spirit and being in the mind knowing all possible beings, because the mind becomes all beings rendered knowable. Through this occurs the revelation of absolute being, God. This is the knowing of absolute being as last-knowable-being, but knowable only in its infinitely distant obscurity. Geist in Welt blends the following: 1.) Maréchal’s themes of dynamism; 2.) the co-affirmation of absolute being with the grasp of being-most-irreducible and all possible beings to be known in the act of knowing of the human being; 3.) Heidegger’s themes of being-most-irreducible; 4.) Dasein as unformed and thus the self-constituting embeddedness of human being in the world; and 5.) the world and the beings of the world as the means to discover being-most-irreducible.

b. Hörer des Wortes

In Hörer des Wortes Rahner restates, more lucidly, his themes from Geist in Welt. Recasting these themes in terms of metaphysics, Rahner notes that metaphysics addresses the question of being as being-most-irreducible. Metaphysics formulates the question of the beingness of beings as the question of being-most-irreducible. The question of the beingness of beings, being-most-irreducible, addresses the ground of this beingness, this being-most-irreducible. These questions arise because of the ultimate unity of being and being-known. Rahner called this unity of being and being-known, in another neologism, Bei-sich-sein, being-present-to-itself. Beingness is analogical. There are degrees of being-present-to-itself as the unity of being and being-known, just as there are degrees of intensity of the self-presence, the unity. God is absolute being and therefore absolute being-present-to-itself. For Rahner God is the absolute unity of being and being-known, the absolute possession of beingness; therefore God is the ground of being-most-irreducible. Human being, through the Vorgriff, constitutes itself as the dynamic self-movement of spirit, the identity of being and being-known, toward the absolute compass of all possible beings-as-knowable.

This movement requires the co-affirmation of the absolute being, God, as the being characterized by absolute self-possession of being. God as God, the pinnacle of the analogy of being-present-to-itself, is the possibility of the Vorgriff. Thus, human being as spirit is the openness of the finite to God, the absolute infinite. Human being as spirit is the dynamic self-movement of transcendence to absolute being, to God, and thus the possibility of the disclosure of this absolute being. The absolute being-present-to-itself that is God is correlated to human being as an endlessly dynamic self-movement of transcendence, and thus to the analogy of being-present-to-itself in degrees of self-possession and degrees of intensity of unities of being and being-known. This absolute transcendence of human being as spirit toward the infinity of beingness as the absolute self-presence of absolute being is the limitless compass of the Vorgriff. In addition, this same Vorgriff is the possibility of the appearance of the limitless God to limited human beings. The Vorgriff is not limitless in itself, but it is limitless in the endless dynamic of spirit. In this endless dynamic is the co-affirmation of absolute being in the limitless compass of the Vorgriff as it addresses all possible beings as knowable. Fundamentally, the Vorgriff is the awaiting of the disclosure of the absolute being present in that dynamism. Therefore, Rahner declares that this self-disclosure is inevitable and that even the silence of refusal is a disclosure of absolute being.

4. Summary

Rahner’s utterly unique reading of Thomas through Maréchal and Heidegger cost him his doctorate in philosophy at Freiburg. Yet Geist in Welt demonstrates the fecundity of Rahner’s mind. Taking a single question from the Summa and but a single article in that question, Rahner, using medieval epistemological categories, weaves Maréchal, Rousselot, and Heidegger into a vibrant transcendental synthesis. Other Catholic philosophers remained closer to Maréchal, especially Francophone philosophers. These were the practitioners of Transcendental Thomism. Rahner’s philosophy forwards a unique transcendental Thomism featuring 1.) the Heideggerian rootedness of human being in its world. This comprises the vast field of beings that is the medium through which being-most-irreducible is revealed; being-most-irreducible is the proper fulfillment of human being. 2.) The Rousselotian identity of knowing and being as spirit as the hierarchy of degrees of self-possession. 3.) The Maréchallian themes of the endless dynamism of mind and the intrinsic co-affirmation of absolute being in that dynamism. This is Rahner’s unique synthesis; it demonstrates the synthetic power of his mind, the uniqueness of his insight in building this edifice on the alien foundation of medieval scholasticism, and the complexity and subtlety of his system-building skill.

5. References and Further Reading

a. Karl Rahner: Primary

  • Karl Rahner, Geist in Welt, dritte Auflage, München: Kösel, 1941
  • Karl Rahner, Hörer des Wortes, zweite Auflage, München: Kösel, 1968
  • Karl Rahner, Spirit in the World New York: Continuum, 1994
  • Karl Rahner, Hearer of the Word New York: Continuum, 1994

b. Karl Rahner: Secondary

  • Patrick Burke, Re-interpreting Rahner: A Critical Study of his Major Themes NY: Fordham, 2002
  • Stephen Fields, Being as Symbol Washington DC: Georgetown, 2000
  • Karen Kilby, Karl Rahner: Theology and Philosophy London: Routledge, 2003
  • Thomas Sheehan, Rahner Athens OH: Ohio University Press, 1987

c. Immanuel Kant: Primary

Immanuel Kant, Kritik der Reinen Vernunft 1 & 2 (Bände III/IV, Werkausgabe in 12 Bänden) Berlin: Suhrkamp, 1974

d. Pierre Rousselot: Primary

  • Pierre Rousselot, L’intellectualisme de St. Thomas, 2nd ed., Paris: Beauchesne, 1908
  • Pierre Rousselot, The Intellectualism of St. Thomas (translated, James Mahoney) New York: Sheed and Ward, 1935
  • Pierre Rousselot, Intelligence: Sense of Being, Faculty of God (translated, Andrew Tallon) Milwaukee, WI: Marquette University Press, 2002

e. Pierre Rousselot: Secondary

John McDermott, Love and Understanding Rome: Gregorian University, 1983

f. Joseph Maréchal: Primary

Joseph Maréchal, Le Point de Départ de la Métaphysique, 5 volumes, Paris: Desclée de Brouwer, 1922

g. Joseph Maréchal: Secondary

Anthony M. Matteo, Quest for the Absolute DeKalb IL: Northern Illinois University Press, 1992

h. Martin Heidegger: Primary

  • Martin Heidegger, Sein und Zeit, zehnte Auflage, Tübingen: Max Niemeyer, 1963
  • Martin Heidegger, Being and Time (translated, John Macquarrie and Edward Robinson) NY: HarperCollins, 2008


Author Information

Guy Woodward
Email: gwoodward127@gmail.com

U. S. A.

Dialogical Logic

Dialogical logic is an approach to logic in which the meaning of the logical constants (connectives and quantifiers) and the notion of validity are explained in game-theoretic terms. The meaning of each logical constant (such as “and”, “or”, “implies”, “not”, “every”, and so forth) is given in terms of how assertions containing these logical constants can be attacked and defended in an adversarial dialogue. Dialogues are described as two-player games between a proponent and an opponent. A dialogue starts with an assertion made by the proponent. This assertion can then be attacked according to its logical form by the opponent. Depending upon the kind of attack, the proponent can now either defend against, or attack, the opponent’s move. The two players alternate until one player is unable to make another move. In that case, the dialogue is won by the other player, the one who made the last move. The assertion made in the initial move by the proponent is said to be valid if the proponent has a winning strategy for it, that is, if the proponent can win every dialogue no matter which moves the opponent makes. The dialogical approach was initially worked out for intuitionistic logic and for classical logic; it has been extended to other logics, among them modal logic and linear logic.

Table of Contents

  1. Introduction
  2. Argumentation Forms and the Meaning of Logical Constants
  3. Dialogues for Intuitionistic Logic
    1. Definition
    2. Examples
  4. Winning Strategies and Validity
    1. Winning Strategies
    2. Examples
    3. First-Order Winning Strategies
    4. Tertium Non Datur and the Principle of Non-Contradiction
    5. Dialogical Validity and Completeness
    6. Winning Strategies as Proofs
  5. Dialogues for Classical Logic
    1. Examples
    2. Classical Dialogical Validity and Completeness
  6. Origins and Recent Developments
  7. References and Further Reading

1. Introduction

Dialogical logic comprises three main constituents:

(i) Argumentation forms. The meaning of the logical constants (like “implies”, for example) is given by so-called argumentation forms. An argumentation form describes in terms of two possible kinds of moves, called attack and defense, how assertions containing a certain logical constant in main position can be attacked and defended. For example, the argumentation form for “implies” says that if one player asserts “A implies B”, then the other player can attack this assertion by claiming A, which can in turn be defended by the first player by claiming B. This reflects how logical constants are understood in everyday argumentation: someone arguing for “A implies B” has to be able to argue for B when being granted that A.

(ii) Dialogues. A dialogue is a single game played by two players, called proponent and opponent. The proponent moves first by making an assertion. Then players alternate moves. Each move has to be made according to an argumentation form. In addition, certain rules or conditions are imposed, which go beyond what has been laid down in the argumentation forms. An example of such a rule is that a defense against a certain attack cannot be repeated. There are rules that hold for both players as well as rules that restrict only one of the two players. An example for the latter is the rule that a statement containing no logical constant can only be asserted by the proponent after it has already been asserted by the opponent, whereas the opponent can assert such a statement at any time, if allowed by an argumentation form and not prohibited by other conditions. The proponent wins a dialogue if the opponent cannot make another move.

(iii) Winning strategies. The notion of winning strategy for dialogue games provides the dialogical notion of validity or, depending on the point of view, of provability. In dialogical logic, an assertion is called valid if there exists a winning proponent strategy for it. That is, an assertion is valid if the proponent can always win a dialogue for it, no matter what moves are made by the opponent. Depending on the conditions specifying the kind of dialogues which are played, there may or may not be a winning proponent strategy for a given assertion. This means that by changing the rules of the dialogue games one can obtain different notions of validity, such as intuitionistic validity or classical validity, for example.

2. Argumentation Forms and the Meaning of Logical Constants

Argumentation forms are formulated for an extended first-order language. The first-order language consists of formulas A, B, \ldots, A_1,\ldots, which are constructed from atomic formulas a, b, c, \ldots with the logical constants \wedge (conjunction; “and”), \vee (disjunction; “or”), \rightarrow (implication; “implies”), \neg (negation; “not”), \forall (universal quantifier; “every”) and \exists (existential quantifier; “there is”), together with terms t, which can be variables x,y,\ldots, and auxiliary signs “(”, “)” and “,”. The atomic formulas can be relation symbols of arbitrary arity taking terms as arguments. An example of a first-order formula is

    \[\forall x \exists y (a(x,y) \rightarrow b(x))\]

This language is extended by using ?1, ?2, ?\!\vee, ?t (for terms t) and ?\exists as special symbols (each beginning with a question mark). In addition, the signs P and O stand for the two players, proponent and opponent, respectively. An expression e is either a formula or a special symbol. For each expression e there is a P-signed expression P\, e and an O-signed expression O\, e. These signed expressions are called moves in general. Examples of moves are

    \[P\, \forall x \exists y (a(x,y) \rightarrow b(x))\]

and

    \[O\, ?\!\vee\]

A signed expression is called assertion if the expression is a formula; it is called symbolic attack if the expression is a special symbol (there is no such thing as a symbolic defense). X and Y, where X \neq Y, are used as variables for P and O.

For each logical constant there is one argumentation form, which determines how a formula, with the respective logical constant as main constant, that is asserted by X can be attacked by Y, and how this attack can be defended by X (if possible):

    \[\begin{array}{rlll} \wedge: & \text{assertion}: & X\, A_1 \wedge A_2 & \\ & \text{attack}: & Y\, ?i & (Y \text{ chooses } i = 1 \text{ or } i = 2)\\ & \text{defense}: & X\, A_i & \\ &\\ \vee: & \text{assertion}: & X\, A_1 \vee A_2 & \\ & \text{attack}: & Y\, ?\!\vee & \\ & \text{defense}: & X\, A_i & (X \text{ chooses } i = 1 \text{ or } i = 2)\\ &\\ \rightarrow: & \text{assertion}: & X\, A \rightarrow B & \\ & \text{attack}: & Y\, A & \\ & \text{defense}: & X\, B & \\ &\\ \neg: & \text{assertion}: & X\, \neg A & \\ & \text{attack}: & Y\, A & \\ & \text{defense}: & \mathit{no\ defense} & \\ &\\ \forall: & \text{assertion}: & X\, \forall x A(x) & \\ & \text{attack}: & Y\, ?t & (Y \text{ chooses the term } t)\\ & \text{defense}: & X\, A(x)[t/x] & \\ &\\ \exists: & \text{assertion}: & X\, \exists x A(x) & \\ & \text{attack}: & Y\, ?\exists & \\ & \text{defense}: & X\, A(x)[t/x] & (X \text{ chooses the term } t) \end{array}\]

The argumentation form for \wedge says that an assertion of the form A_1 \wedge A_2 made by player X can be attacked by the other player Y by choosing one of the two conjuncts A_1 and A_2. This is expressed by Y stating the special symbol ?1 or ?2, respectively. This attack can then be defended by player X by asserting the conjunct A_1 or the conjunct A_2 according to the choice of Y. A concrete instance of this argumentation form is the following:

    \[\begin{array}{l}P\, \neg a \wedge (b \vee a)\\O\, ?2\\P\, b \vee a\end{array}\]

Here the proponent P has asserted the conjunction \neg a \wedge (b \vee a). This is attacked by the opponent O choosing the second conjunct b \vee a, indicated by stating the special symbol ?2. The proponent defends against this attack by asserting the second conjunct b \vee a. Informally, someone claiming “A_1 and A_2” has to be able to argue for A_1 and for A_2; an opponent can thus ask for either of the two.

For disjunctions of the form A_1 \vee A_2 asserted by player X, the attack by the other player Y is indicated by the special symbol ?\!\vee. The defending player X chooses one of the two disjuncts A_1 and A_2. Informally, someone claiming “A_1 or A_2” has only to be able to argue for one of the two disjuncts, and can therefore choose to argue for A_1 or for A_2, if the claimed disjunction is questioned.

In the case of implications A \rightarrow B asserted by player X, the attacking player Y asserts the antecedent A of the implication, and the defending player X asserts the consequent B. Informally, someone claiming “A implies B” has to be able to argue for B whenever A is given as an assumption.

Negated assertions \neg A made by player X can only be attacked, namely by the other player Y asserting A. There is no defense against such an attack. Informally, when claiming that “A is not the case”, one has to be able to argue against A.

The argumentation form for the universal quantifier \forall says that an assertion of the form \forall x A(x) made by player X can be attacked by player Y by choosing a term t in the symbolic attack Y\, ?t. Player X can then defend by asserting the formula A(x)[t/x], where the term t chosen by Y is substituted for (all occurrences of) the variable x in the formula A(x), also written A(t). Informally, someone claiming that “every object has the property A” has to be able to show for any object that it has the property A. An opponent can thus ask for any object (denoted by the term t, for example) whether it has the property A. This has then to be answered by an argument for A(t).

For existential assertions of the form \exists x A(x) made by player X, the attack by the other player Y is indicated by the special symbol ?\exists. The defending player X chooses a term t and asserts the formula A(x)[t/x], that is, the formula resulting from substituting t for (all occurrences of) x in A(x). Informally, someone claiming that “there exists an object with property A” has only to be able to present one object (denoted by the term t, for example) with the property A, and can thus choose to argue for A(t), when the claimed existence of such an object is questioned.

The argumentation forms thus explain the meaning of the logical constants by saying how assertions that contain the respective logical constant in main position are used in argumentations between the two players, proponent P and opponent O. This explanation is intended to capture how assertions are used according to their logical form in actual argumentation.

In the literature, argumentation forms are also called particle rules or logical rules.
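
Because the argumentation forms are purely schematic, they can also be written down as data. The following Python sketch is not part of the dialogical-logic literature; it is a minimal illustration, with hypothetical helper names (attacks, defenses) and an assumed tuple encoding of formulas, of how the table above fixes which player has a choice at each point. The ASCII tokens '?1', '?2', '?v', ('?t', t) and '?e' stand in for the special symbols ?1, ?2, ?∨, ?t and ?∃.

```python
# A sketch (not from the article) of the argumentation forms as data.
# Formulas are nested tuples: ('and', A, B), ('or', A, B), ('imp', A, B),
# ('not', A), ('forall', x, A), ('exists', x, A); atomic formulas are strings.

def substitute(formula, var, term):
    """Naively replace the variable `var` by `term` in `formula`."""
    if isinstance(formula, str):
        return formula.replace(var, term)      # atoms are strings such as 'a(x)'
    if formula[0] in ('forall', 'exists') and formula[1] == var:
        return formula                         # var is rebound; leave untouched
    return (formula[0],) + tuple(substitute(f, var, term) for f in formula[1:])

def attacks(formula, terms=('t1',)):
    """All attacks the other player may make on `formula`."""
    if isinstance(formula, str):
        return []                              # atomic formulas cannot be attacked
    op = formula[0]
    if op == 'and':
        return ['?1', '?2']                    # attacker picks a conjunct
    if op == 'or':
        return ['?v']
    if op in ('imp', 'not'):
        return [formula[1]]                    # attacker asserts A
    if op == 'forall':
        return [('?t', t) for t in terms]      # attacker picks the term
    if op == 'exists':
        return ['?e']

def defenses(formula, attack, terms=('t1',)):
    """All defenses available against `attack` on `formula` (possibly none)."""
    if isinstance(formula, str):
        return []
    op = formula[0]
    if op == 'and':
        return [formula[1] if attack == '?1' else formula[2]]
    if op == 'or':
        return [formula[1], formula[2]]        # defender picks a disjunct
    if op == 'imp':
        return [formula[2]]                    # defender asserts B
    if op == 'not':
        return []                              # no defense against attacks on negations
    if op == 'forall':
        return [substitute(formula[2], formula[1], attack[1])]
    if op == 'exists':
        return [substitute(formula[2], formula[1], t) for t in terms]

# The instance from the text: P asserts ¬a ∧ (b ∨ a), O attacks with ?2.
conj = ('and', ('not', 'a'), ('or', 'b', 'a'))
assert attacks(conj) == ['?1', '?2']
assert defenses(conj, '?2') == [('or', 'b', 'a')]
```

Nothing here goes beyond the table above; the helpers only make the attacker's and defender's choices explicit.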

3. Dialogues for Intuitionistic Logic

Dialogical logic was at first developed for intuitionistic logic and for classical logic.

Classical logic is usually based on the principle of bivalence. Each assertion is either true or false, and the truth value of a compound assertion is determined by the truth values of its constituents. For example, the meaning of the logical constant \wedge is given by saying that assertions of the form A \wedge B have the truth value “true” if both conjuncts A and B have the truth value “true”; otherwise the truth value of A \wedge B is “false”.

Intuitionistic logic is not based on bivalence. Instead of employing truth values, the intuitionistic meaning of logical constants is usually explained in terms of proofs or transformations of proofs. For example, the meaning of \wedge is explained by saying that an assertion of the form A \wedge B has a proof if and only if both A and B have a proof. An implication A \rightarrow B has a proof if and only if one is in possession of a construction that transforms any proof of the antecedent A into a proof of the consequent B. The classical principle of tertium non datur (A \vee \neg A) is rejected in intuitionistic logic, and other classical principles such as double negation elimination (\neg\neg A \rightarrow A) do not hold. The set of intuitionistic theorems is a subset of the set of classical theorems. In particular, A \rightarrow B is not equivalent to \neg A \vee B in intuitionistic logic, whereas it is in classical logic. In intuitionistic logic, implication (\rightarrow) is a genuine logical constant; it cannot be expressed by using other logical constants such as negation (\neg) and disjunction (\vee).

Dialogical formulations can be given for both classical logic and intuitionistic logic. The difference between these formulations is made by different notions of dialogue, specified by certain conditions. Dialogues for intuitionistic logic are defined next. Dialogues for classical logic will be dealt with below, in Section 5. The argumentation forms underlying these notions do not differ; they are the same for both logics.

a. Definition

Dialogues for intuitionistic logic are defined, with respect to the given argumentation forms, as finite or infinite sequences of moves satisfying the following dialogue conditions:

  0. The first move is made by the proponent P with the assertion of a non-atomic formula, and proponent P and opponent O alternate moves as determined by the argumentation forms.
  1. P may assert an atomic formula only if it has been asserted by O before.
  2. If there is more than one open attack, then only the last one may be defended.
    (An attack is open if it has not been defended yet. Attacks made according to the argumentation form for \neg are always open, since there is no defense against them.)
  3. An attack may be defended at most once.
  4. A P-signed formula may be attacked at most once.

These conditions are also called structural rules or frame rules in the literature. They are here given only informally. Condition 3 refers to one occurrence of an attack, and condition 4 refers to one occurrence of a P-signed formula. Hence, if an already defended attack is repeated, then condition 3 does not prohibit that this new occurrence of the attack can be defended once, too. This holds likewise for an already attacked P-signed formula. If this formula is again asserted by P, then condition 4 does not prohibit that this new occurrence can be attacked once as well.

It can be observed that proponent and opponent are not interchangeable. This is only due to conditions 1 and 4, which are asymmetric for P and O. In particular, the moves X\, A and Y\, \neg A do not amount to the same because of condition 1. The argumentation forms, on the other hand, are completely symmetric with respect to P and O, that is, the argumentation forms are player-independent.

The proponent wins a dialogue for a formula A if the dialogue is finite, begins with the move P\, A and ends with a move of P such that O cannot make another move, that is, every move that O could make according to the argumentation forms violates at least one of conditions 0 to 4.
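
Since the dialogue conditions are stated procedurally, it may help to see a dialogue written out as plain data. The sketch below is again only an assumption-laden illustration: the Move record and the two checks are hypothetical, and they mechanize just two easily isolated points, condition 1 (P asserts an atomic formula only after O has) and the fact that a finite dialogue won by P must end with a P-move. It reuses the tuple encoding of formulas from the earlier sketch.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Expr = Union[str, Tuple]        # a formula as above, or a special symbol such as '?v'

@dataclass
class Move:
    player: str                 # 'P' or 'O'
    expr: Expr                  # asserted formula or symbolic attack
    kind: str                   # 'assertion', 'attack' or 'defense'
    ref: int = -1               # position of the move attacked/defended (-1: initial move)

def is_atomic(expr):
    return isinstance(expr, str) and not expr.startswith('?')

def respects_condition_1(dialogue):
    """Condition 1: P may assert an atomic formula only if O asserted it before."""
    asserted_by_O = set()
    for move in dialogue:
        if move.player == 'P' and is_atomic(move.expr) and move.expr not in asserted_by_O:
            return False
        if move.player == 'O' and is_atomic(move.expr):
            asserted_by_O.add(move.expr)
    return True

def ends_with_P(dialogue):
    """Necessary (not sufficient) for the dialogue to be won by P."""
    return bool(dialogue) and dialogue[-1].player == 'P'

# The dialogue for a -> (b -> a) from the next subsection, written out as data.
d = [
    Move('P', ('imp', 'a', ('imp', 'b', 'a')), 'assertion'),
    Move('O', 'a',               'attack',  0),
    Move('P', ('imp', 'b', 'a'), 'defense', 1),
    Move('O', 'b',               'attack',  2),
    Move('P', 'a',               'defense', 3),
]
assert respects_condition_1(d) and ends_with_P(d)
```

Checking the remaining conditions (alternation, open attacks, repetition limits) would be similar bookkeeping over the ref fields; the point of the sketch is only to fix one concrete representation.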

b. Examples

Dialogues are written with position numbers on the left and with comments on the right. The comments make explicit what kind of move is made (attack or defense) and to which preceding move each move refers. In this notation, moves have the format

    \[\langle\text{position number}\rangle \langle\text{signed expression}\rangle \langle\text{comment}\rangle\]

The following is a dialogue for the formula a \rightarrow (b \rightarrow a):

    \[\begin{array}{rll}0. & P\, a \rightarrow (b \rightarrow a) & \\1. & O\, a & (\text{attack on }0)\\2. & P\, b \rightarrow a & (\text{defense against }1)\\ 3. & O\, b & (\text{attack on }2)\\ 4. & P\, a & (\text{defense against }3)\end{array}\]

The dialogue starts with the assertion of the formula a \rightarrow (b \rightarrow a) by the proponent P in the initial move at position 0. This initial move is attacked by the opponent O at position 1 with the assertion of the antecedent a of the implication asserted by P at position 0. The attack is thus made according to the argumentation form for \rightarrow. In the next move at position 2, the proponent defends against this attack according to the argumentation form for \rightarrow by asserting the consequent b \rightarrow a of the attacked implication a \rightarrow (b \rightarrow a). The implication b \rightarrow a is attacked by O at position 3 by asserting its antecedent b. This attack is defended by P at position 4 by asserting a, the consequent of b \rightarrow a. Here P is allowed to assert the atomic formula a, since O has asserted a before (compare condition 1). These last moves are also made according to the argumentation form for \rightarrow. The opponent cannot make another move, since atomic formulas cannot be attacked (there are only argumentation forms for non-atomic assertions), and O cannot repeat attacks due to condition 4. The dialogue is thus won by P.

The following is a dialogue for the formula \neg a \vee (a \rightarrow a):

    \[\begin{array}{rll}0. & P\, \neg a \vee (a \rightarrow a)\\1. & O\, ?\!\vee & (\text{attack on }0)\\2. & P\, \neg a & (\text{defense against }1)\\3. & O\, a & (\text{attack on 2})\end{array}\]

The initial move is attacked by O with the symbolic attack O\, ?\!\vee according to the argumentation form for \vee. This attack can be defended by P either by asserting the left disjunct \neg a or by asserting the right disjunct a \rightarrow a. Here, the proponent chooses the former in the defense move at position 2. This is attacked by O with the assertion of a according to the argumentation form for \neg. The dialogue is not won by P, and P cannot make another move: the atomic formula a cannot be attacked, there is no defense against attacks on negated formulas, and due to condition 3 it is not possible to defend against the attack O\, ?\!\vee again.

Another dialogue for the same formula \neg a \vee (a \rightarrow a) is obtained if P defends against the attack O\, ?\!\vee by choosing to assert the right disjunct a \rightarrow a instead of the left disjunct at position 2:

    \[\begin{array}{rll}0. & P\, \neg a \vee (a \rightarrow a)\\1. & O\, ?\!\vee & (\text{attack on }0)\\2. & P\, a \rightarrow a & (\text{defense against }1)\\3. & O\, a & (\text{attack on }2)\\4. & P\, a & (\text{defense against }3)\end{array}\]

At position 3, the opponent attacks a \rightarrow a by asserting its antecedent a, which P defends against at position 4 with the assertion of the consequent a. This dialogue is finite and ends with a move of P such that O cannot make another move, that is, this dialogue is won by P.
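
For what it is worth, the proponent's choice point at position 2 of these two dialogues can be reproduced with the hypothetical attacks and defenses helpers sketched in Section 2: the single attack on a disjunction leaves P with exactly two candidate defenses, only one of which leads to a dialogue won by P.

```python
# Reusing the hypothetical helpers from the sketch in Section 2.
f = ('or', ('not', 'a'), ('imp', 'a', 'a'))   # ¬a ∨ (a → a)

print(attacks(f))         # ['?v']: the only possible attack on a disjunction
print(defenses(f, '?v'))  # [('not', 'a'), ('imp', 'a', 'a')]: P's two options

# Defending with ('not', 'a') leads to the first, lost dialogue above;
# defending with ('imp', 'a', 'a') lets P answer O's attack a with P a and win.
```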

These two dialogues for the formula \neg a \vee (a \rightarrow a) show that for a valid formula there can be dialogues which are won by P as well as dialogues which are not won by P, although every possible move has been made. There are also invalid formulas for which this is the case. An example is a \wedge (a \rightarrow a). If O attacks this formula with O\, ?2, then P wins the dialogue: the defense P\, a \rightarrow a is attacked with O\, a, which P defends against with P\, a as final move. If, however, O attacks with O\, ?1, then P cannot make another move, since the first conjunct a, which is an atomic formula, cannot be asserted because of condition 1.

A dialogue for the first-order formula \neg\forall x\neg a(x) \rightarrow \exists x a(x) is the following:

    \[\begin{array}{rll}0. & P\, \neg\forall x\neg a(x) \rightarrow \exists x a(x)\\1. & O\, \neg\forall x\neg a(x) & (\text{attack on }0)\\2. & P\, \forall x\neg a(x) & (\text{attack on }1)\\3. & O\, ?t_1 & (\text{attack on }2)\\4. & P\, \neg a(t_1) & (\text{defense against }3)\\5. & O\, a(t_1) & (\text{attack on }4)\end{array}\]

Instead of defending against O‘s attack (made at position 1) with P\, \exists x a(x) at position 2, the proponent attacks O‘s assertion of \neg\forall x\neg a(x) by asserting \forall x\neg a(x), according to the argumentation form for \neg. At position 3, the opponent attacks according to the argumentation form for \forall by choosing the term t_1 in the symbolic attack O\, ?t_1, which P defends against by asserting \neg a(t_1) at position 4. The opponent attacks this according to the argumentation form for \neg with a(t_1) in the last move. Note that there are now two open attacks made by O: the one at position 1 and the one in the last move. Due to condition 2, only the last of the two may be defended. Thus the proponent cannot catch up with the defense P\, \exists x a(x) of O‘s attack made at position 1, and according to the argumentation form for \neg there is no defense to the last move. The proponent can only repeat the attack made at position 2, which leads to an infinite dialogue where a sequence of moves like

    \[\begin{array}{rll}n. & P\, \forall x\neg a(x) & (\text{attack on }1)\\n+1. & O\, ?t & (\text{attack on }n)\\n+2. & P\, \neg a(t) & (\text{defense against }n+1)\\n+3. & O\, a(t) & (\text{attack on }n+2)\end{array}\]

is repeated endlessly; the opponent may choose a different term t in each repetition. Note that repeated attacks on the same move are prohibited only for O, while for P there is no such restriction. Hence P can repeatedly attack the move O\, \neg\forall x\neg a(x) (made at position 1) with the move P\, \forall x\neg a(x), starting the loop. Furthermore, each occurrence of an attack is defended at most once, and each occurrence of a P-signed formula is attacked at most once, in accordance with conditions 3 and 4, respectively. This dialogue for the formula

    \[\neg\forall x\neg a(x) \rightarrow \exists x a(x)\]

is therefore neither won by P nor does it end with an opponent move.

In summary, there are valid formulas for which there are finite dialogues that are not won by P as well as dialogues that are won by P, there are invalid formulas for which the same holds, and there can also be infinite dialogues. The notion of winning a single dialogue is therefore not by itself sufficient for distinguishing valid from invalid formulas.

4. Winning Strategies and Validity

In dialogical logic the central logical notion of validity is explained in terms of the game-theoretic notion of strategy. A strategy determines each move of a player. The crucial notion for validity is that of a winning proponent strategy. Dialogical validity of a formula consists in the existence of a winning proponent strategy for that formula.

a. Winning Strategies

A player X has a winning strategy if, for every move made by the other player Y, player X can make a move such that each resulting play of the game (that is, each resulting dialogue) is eventually won by X. In dialogical logic one is usually only interested in winning strategies for the proponent P. A winning proponent strategy for a formula A is a tree S whose nodes are moves and whose branches are dialogues for A won by P, such that

  1. S has the move P\, A as root node (with depth 0),
  2. if the depth of a node is odd (that is, if the node is an opponent move), then it has exactly one successor node (which is a proponent move),
  3. if the depth of a node is even (that is, if the node is a proponent move), then it has as many successor nodes as there are possible moves for O at this position.
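Conditions 2 and 3 of this definition are purely structural and can be checked by a simple recursion over a finite tree of moves. The following Python sketch is not part of the dialogical framework itself; it merely restates the second and third conditions for a generic tree whose nodes carry a move and a list of successor nodes, and it delegates every question of move legality to a hypothetical helper possible_opponent_moves, which would have to implement the dialogue conditions. The first condition, that the root is the move P A, is simply presupposed.

    from dataclasses import dataclass, field
    from typing import Callable, List, Sequence, Tuple

    @dataclass
    class Node:
        move: str                          # e.g. "P a -> a" or "O ?v"
        children: List["Node"] = field(default_factory=list)

    def is_winning_proponent_strategy(node: Node,
                                      possible_opponent_moves: Callable[[Tuple[str, ...]], Sequence[str]],
                                      depth: int = 0,
                                      branch: Tuple[str, ...] = ()) -> bool:
        """Check the second and third conditions on a finite strategy tree.

        possible_opponent_moves(branch) is assumed to return the moves O could
        legally make after the given sequence of moves; this is where the
        dialogue conditions would have to be implemented."""
        branch = branch + (node.move,)
        if depth % 2 == 1:
            # Opponent move: the strategy answers it with exactly one proponent move.
            return len(node.children) == 1 and is_winning_proponent_strategy(
                node.children[0], possible_opponent_moves, depth + 1, branch)
        # Proponent move: there must be one subtree per possible opponent move.
        # If O has no possible move, the branch ends here as a dialogue won by P.
        if {child.move for child in node.children} != set(possible_opponent_moves(branch)):
            return False
        return all(is_winning_proponent_strategy(child, possible_opponent_moves, depth + 1, branch)
                   for child in node.children)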

b. Examples

The following dialogue, which has been discussed above, is already a winning proponent strategy for the formula \neg a \vee (a \rightarrow a):

    \[\begin{array}{rll}0. & P\, \neg a \vee (a \rightarrow a)\\1. & O\, ?\!\vee & (\text{attack on }0)\\2. & P\, a \rightarrow a & (\text{defense against }1)\\3. & O\, a & (\text{attack on }2)\\4. & P\, a & (\text{defense against }3)\end{array}\]

This winning proponent strategy has only one branch, which is the dialogue shown. The root node

    \[P\, \neg a \vee (a \rightarrow a)\]

has only one successor, since the move O\, ?\!\vee is the only possible move for O. This node at depth 1 has exactly one successor node, namely the move P\, a \rightarrow a at depth 2. This in turn has again only one successor node, namely the move O\, a at depth 3, since no other opponent moves are possible. Its one successor node is P\, a, which has no successor as there are no possible opponent moves. The dialogue is won by P, and all possible opponent moves have been taken into account. This single branch tree is thus a winning proponent strategy for the formula \neg a \vee (a \rightarrow a).

In general, winning proponent strategies have more than one branch. If there are several opponent moves possible after a proponent move, then there will be a branch for each of the possible opponent moves. Consider the following winning proponent strategy for the formula (\neg a \vee b) \rightarrow (a \rightarrow b):

\begin{array}{rll}0. & P\, (\neg a \vee b) \rightarrow (a \rightarrow b) & \\ 1. & O\, \neg a \vee b & (\text{attack on }0)\\ 2. & P\, a \rightarrow b & (\text{defense against }1)\\ 3. & O\, a & (\text{attack on }2)\\ 4. & P\, ?\!\vee & (\text{attack on }1)\end{array}
\begin{array}{rll|ll}5. & O\, \neg a & (\text{defense against }4) & O\, b & (\text{defense against }4)\\ 6. & P\, a & (\text{attack on }5) & P\, b & (\text{defense against }3)\end{array}

This tree of signed expressions has two branches. After the move P\, ?\!\vee at depth 4 there are two possible moves for O, yielding the two successor nodes O\, \neg a (left branch) and O\, b (right branch). The proponent wins both resulting dialogues: O can neither make another move after P\, a (left dialogue) nor after P\, b (right dialogue). Thus, this tree is a winning proponent strategy.
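Written down as data, the strategy just displayed is a nested tree that branches exactly where several opponent moves are possible. The following Python sketch is only a transcription of that tree, with ~, | and ?v as ASCII stand-ins for \neg, \vee and the symbolic attack ?\!\vee; enumerating its branches yields the two dialogues won by P.

    # Each node is a pair (move, list of successor nodes); the tree branches
    # after P's attack ?v at depth 4, where O can answer in two ways.
    strategy = (
        "P (~a | b) -> (a -> b)", [
            ("O ~a | b", [
                ("P a -> b", [
                    ("O a", [
                        ("P ?v", [
                            ("O ~a", [("P a", [])]),   # left branch: P attacks O's ~a with a
                            ("O b",  [("P b", [])]),   # right branch: P defends against the attack at depth 3
                        ]),
                    ]),
                ]),
            ]),
        ],
    )

    def branches(node, prefix=()):
        """Enumerate the branches of the tree; each branch is one dialogue won by P."""
        move, children = node
        prefix = prefix + (move,)
        if not children:
            yield prefix
        for child in children:
            yield from branches(child, prefix)

    for branch in branches(strategy):
        print(" ; ".join(branch))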

c. First-Order Winning Strategies

Winning strategies for quantifier-free formulas are always finite trees, whereas winning strategies for first-order formulas can in general be trees with countably infinitely many finite branches. An example is the following winning proponent strategy for the first-order formula \neg\exists x a(x) \rightarrow \forall x \neg a(x):

\begin{array}{rcl}0. & \hspace{3em} P\, \neg\exists x a(x) \rightarrow \forall x \neg a(x) \hspace{3em}&\\1. & O\, \neg\exists x a(x) & (\text{attack on }0)\\2. & P\, \forall x \neg a(x) & (\text{defense against }1)\end{array}
\begin{array}{rl|l|l|ll}3. & O\, ?t_1 & O\, ?t_2 & O\, ?t_3 & \ldots\hspace{.85em} & (\text{attack on }2)\\4. & P\, \neg a(t_1) & P\, \neg a(t_2) & P\, \neg a(t_3) & \ldots & (\text{defense against }3)\\ 5. & O\, a(t_1) & O\, a(t_2) & O\, a(t_3) & \ldots & (\text{attack on }4)\\ 6. & P\, \exists x a(x) & P\, \exists x a(x) & P\, \exists x a(x) & \ldots & (\text{attack on }1)\\ 7. & O\, ?\exists & O\, ?\exists & O\, ?\exists & \ldots & (\text{attack on }6)\\ 8. & P\, a(t_1) & P\, a(t_2) & P\, a(t_3) & \ldots & (\text{defense against }7)\end{array}

The move P\, \forall x \neg a(x) at depth 2 has countably infinitely many successor nodes, since for each choice of a term t_i (for natural numbers i) the symbolic attack O\, ?t_i is a possible move. For pairwise distinct terms t_i the tree therefore has infinitely many branches (indicated by “\ldots”), where each branch is a dialogue won by P.

Such infinite winning strategies can be avoided by using the following restrictions with respect to winning proponent strategies:

  1. If the depth of a node is even, and the symbolic attack O\, ?y on the move P\, \forall x A(x) is a possible move, where y is a new variable in this branch, then O\, ?y is the only immediate successor node that is an attack on P\, \forall x A(x).
  2. If the depth of a node is even, P\, ?\exists is an attack on an assertion O\, \exists x A(x), and the move O\, A(x)[y/x] is a possible defense against this attack, where y is a new variable in this branch, then O\, A(x)[y/x] is the only immediate successor node that is a defense against P\, ?\exists.

There may be further possible moves which are not attacks on P\, \forall x A(x) or defenses against P\, ?\exists. In this case the node at even depth has more than one immediate successor node. However, the number of these immediate successor nodes can only be finite, so that infinite ramification no longer occurs within winning proponent strategies.

For example, the infinite winning proponent strategy indicated above is reduced to the following finite winning proponent strategy:

    \[\begin{array}{rll}0. & P\, \neg\exists x a(x) \rightarrow \forall x \neg a(x) & \\ 1. & O\, \neg\exists x a(x) & (\text{attack on }0)\\ 2. & P\, \forall x \neg a(x) & (\text{defense against }1)\\ 3. & O\, ?y & (\text{attack on }2)\\ 4. & P\, \neg a(y) & (\text{defense against }3)\\ 5. & O\, a(y) & (\text{attack on }4)\\ 6. & P\, \exists x a(x) & (\text{attack on }1)\\ 7. & O\, ?\exists & (\text{attack on }6)\\ 8. & P\, a(y) & (\text{defense against }7)\end{array}\]

The restrictions 1 and 2 have the effect that in winning strategies only one of the possible attacks O\, ?t (for each term t) on P\, \forall x A(x), or only one of the possible defenses O\, A(x)[t/x] (for each term t) against P\, ?\exists, has to be taken into account, namely one where the term t is a new variable. In winning strategies which are not thus restricted, the corresponding moves have to be considered for each term t, including variables. It can be shown that a restricted winning strategy can always be extended to an unrestricted one.

Infinite ramifications in winning strategies can also be avoided by replacing the player-independent argumentation forms for \forall and \exists by the following player-dependent argumentation forms:

    \[\begin{array}{rlll}\forall_{P}: & \text{assertion}: & P\, \forall x A(x) & \\ & \text{attack}: & O\, ?y & (\text{variable }y\text{ is new})\\ & \text{defense}: & P\, A(x)[y/x] & \\ \\ \forall_{O}: & \text{assertion}: & O\, \forall x A(x) & \\ & \text{attack}: & P\, ?t & (P\text{ chooses the term }t)\\ & \text{defense}: & O\, A(x)[t/x] & \\ \\ \exists_{P}: & \text{assertion}: & P\, \exists x A(x) & \\ & \text{attack}: & O\, ?t & (O\text{ chooses the term }t)\\ & \text{defense}: & P\, A(x)[t/x] &\\ \\ \exists_{O}: & \text{assertion}: & O\, \exists x A(x) & \\ & \text{attack}: & P\, ?\exists & \\ & \text{defense}: & O\, A(x)[y/x] & (\text{variable }y\text{ is new})\end{array}\]

The argumentation forms \forall_{P} and \exists_{O} contain the condition that the variable y is new. They are thus history-dependent in the sense that the possibility of a move O\, ?y or O\, A(x)[y/x] in a dialogue depends on whether the variable y has already occurred in this dialogue. The player-independent argumentation forms for \forall and \exists, on the other hand, are not history-dependent.
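Read as a recipe, the player-dependent argumentation forms for the quantifiers are a case distinction on the quantifier and on the player who asserts the formula. The following Python sketch merely transcribes the table above; the tuple representation of quantified formulas, the naive string substitution and the helpers fresh_variable and available_terms (which stand in for the bookkeeping of a branch) are illustrative assumptions, not part of the dialogical framework.

    def quantifier_moves(assertion, asserting_player, fresh_variable, available_terms):
        """Attacks on a quantified assertion and the corresponding defenses,
        transcribing the player-dependent argumentation forms for 'forall' and
        'exists'. Each entry is a pair (attack, defense).

        assertion is a tuple like ("forall", "x", "a(x)"); fresh_variable() is
        assumed to yield a variable not yet used in the branch, and
        available_terms() the terms the attacker may choose from. Substitution
        is done by naive string replacement, adequate only for these examples."""
        quantifier, var, body = assertion
        if quantifier == "forall" and asserting_player == "P":
            # forall_P: O attacks with ?y for a new variable y; P defends with A[y/x].
            y = fresh_variable()
            return [("O ?" + y, "P " + body.replace(var, y))]
        if quantifier == "forall" and asserting_player == "O":
            # forall_O: P attacks with ?t for a term t of P's choice; O defends with A[t/x].
            return [("P ?" + t, "O " + body.replace(var, t)) for t in available_terms()]
        if quantifier == "exists" and asserting_player == "P":
            # exists_P: O attacks with ?t for a term t of O's choice; P defends with A[t/x].
            return [("O ?" + t, "P " + body.replace(var, t)) for t in available_terms()]
        if quantifier == "exists" and asserting_player == "O":
            # exists_O: P attacks with ?exists; O defends with A[y/x] for a new variable y.
            y = fresh_variable()
            return [("P ?exists", "O " + body.replace(var, y))]
        raise ValueError("not a quantified assertion")

    # Example: P asserts forall x a(x); the only attack to consider is O ?y for a new y.
    counter = iter("y%d" % i for i in range(1, 100))
    print(quantifier_moves(("forall", "x", "a(x)"), "P",
                           fresh_variable=lambda: next(counter),
                           available_terms=lambda: ["t1", "t2"]))
    # [('O ?y1', 'P a(y1)')]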

Winning proponent strategies for the resulting dialogues can then be restricted as follows: Only one successor node for a node at even depth has to be considered if

  1. the symbolic attack O\, ?y according to the argumentation form \forall_{P} is a possible opponent move,
  2. the symbolic attack O\, ?t according to the argumentation form \exists_{P} is a possible opponent move,
  3. or the opponent move defending against a symbolic attack P\, ?\exists according to the argumentation form \exists_{O} is a possible move.

Again, further moves according to other argumentation forms may be possible. The resulting winning proponent strategies are finite. The use of the player-dependent argumentation forms therefore has technical advantages. From a conceptual point of view, however, player-independent argumentation forms might be preferable.

d. Tertium Non Datur and the Principle of Non-Contradiction

The proponent does not have a winning strategy for every formula. An example is the instance a \vee \neg a of tertium non datur. There is only one possible dialogue, namely:

    \[\begin{array}{rll}0. & P\, a \vee \neg a & \\ 1. & O\, ?\!\vee & (\text{attack on }0)\\ 2. & P\, \neg a & (\text{defense against }1)\\ 3. & O\, a & (\text{attack on }2)\end{array}\]

which is not won by P. At position 2, the proponent can only defend against O’s symbolic attack O\, ?\!\vee by asserting the right disjunct \neg a. Choosing the left disjunct is not an option due to condition 1. Due to condition 4, the opponent cannot repeat the symbolic attack O\, ?\!\vee at position 3; the only possible move for O is to attack P\, \neg a with O\, a. This attack can neither be defended against nor attacked, and another defense against the already defended symbolic attack O\, ?\!\vee is ruled out by condition 3.

On the other hand, there is a winning proponent strategy for each instance of the principle of non-contradiction, \neg (A \wedge \neg A). Consider the following winning proponent strategy for \neg(a \wedge \neg a):

    \[\begin{array}{rll} 0. & P\, \neg(a \wedge \neg a) & \\ 1. & O\, a \wedge \neg a & (\text{attack on }0)\\ 2. & P\, ?1 & (\text{attack on }1)\\ 3. & O\, a & (\text{defense against }2)\\ 4. & P\, ?2 & (\text{attack on }1)\\ 5. & O\, \neg a & (\text{defense against }4)\\ 6. & P\, a & (\text{attack on }5)\end{array}\]

Here it is essential that P can attack the same assertion repeatedly. The opponent’s assertion of a \wedge \neg a at position 1 is attacked by P first at position 2 by choosing the first conjunct, and again at position 4, now by choosing the second conjunct. Both attacks are necessary for having a winning proponent strategy.

There is no winning proponent strategy in the case of tertium non datur, since P cannot defend against the attack on a \vee \neg a repeatedly, while in the case of the principle of non-contradiction there is a winning proponent strategy because P can attack a \wedge \neg a repeatedly. In classical logic, tertium non datur and the principle of non-contradiction are equivalent, while in intuitionistic logic only the principle of non-contradiction holds. For the dialogues considered, this distinction rests upon the fact that P can attack an assertion repeatedly but cannot defend against an attack repeatedly.

To sum up, there are formulas for which there exists a winning proponent strategy, and there are formulas for which this is not the case. A given formula can also have more than one winning proponent strategy.

e. Dialogical Validity and Completeness

The dialogical notion of validity is defined as follows:

A formula A is called valid if there is a winning proponent strategy for A.

That this dialogical notion of validity corresponds exactly to intuitionistic provability is the content of the following completeness result:

A formula A is valid if and only if A is provable in intuitionistic logic.

Hence, for the dialogues defined by conditions 0 to 4, one obtains a dialogical formulation of intuitionistic logic.

Provability is closed under uniform substitution of formulas for atomic formulas. That is, if a formula A is provable in intuitionistic logic, then each substitution instance A', obtained by uniformly substituting formulas for atomic formulas in A, is provable in intuitionistic logic, too. The completeness result implies that validity is likewise closed under uniform substitution. That is, if there is a winning proponent strategy for A, then there is a winning proponent strategy for each instance A' of A that results from a uniform substitution of formulas for atomic formulas in A.
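Uniform substitution itself is a straightforward recursion over the formula tree. The following Python sketch only makes the notion of a substitution instance A' precise; the representation of formulas as nested tuples, with atoms as strings, is an ad hoc assumption for illustration.

    # Formulas as nested tuples: atoms are strings, compound formulas are
    # ("not", A), ("and", A, B), ("or", A, B) or ("->", A, B).

    def substitute(formula, mapping):
        """Uniformly replace atomic formulas according to mapping (a dict from
        atom names to formulas); atoms not in the mapping are left unchanged."""
        if isinstance(formula, str):                      # atomic formula
            return mapping.get(formula, formula)
        connective, *subformulas = formula
        return (connective, *(substitute(f, mapping) for f in subformulas))

    # Example: substituting (b -> c) for a in a | ~a yields (b -> c) | ~(b -> c).
    instance = substitute(("or", "a", ("not", "a")), {"a": ("->", "b", "c")})
    print(instance)   # ('or', ('->', 'b', 'c'), ('not', ('->', 'b', 'c')))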

f. Winning Strategies as Proofs

Dialogues can also be viewed as constituents of a proof system. On this view, the proofs of a formula A are the winning proponent strategies for A. Completeness is then formulated as an equivalence theorem for winning proponent strategies and proofs in a given proof system such as, for example, sequent calculus:

There is a winning proponent strategy for A if and only if A is provable in sequent calculus for intuitionistic logic.

A constructive proof of this theorem has been given by Felscher [1985], who shows that there are algorithms transforming any winning proponent strategy for a formula A into a proof of A in sequent calculus for intuitionistic logic and, conversely, transforming any proof of A into a winning proponent strategy for A.

5. Dialogues for Classical Logic

A dialogical rendering of classical logic is obtained by relaxing conditions 2 and 3 for the proponent P, while keeping them for the opponent O. That is, conditions 2 and 3 are replaced by the following conditions:

2′. If there is more than one open attack by P, then only the last one may be defended by O.

3′. An attack by P may be defended by O at most once.

For P this means that if there is more than one open attack made by O, then P may defend against any of these attacks (instead of only the last one), and P can defend against attacks made by O repeatedly.

No changes are made to the argumentation forms. Classical dialogues are then defined, with respect to the same argumentation forms, as finite or infinite sequences of moves made according to conditions 0, 1, 2′, 3′ and 4.

Classical winning proponent strategies for a formula A are defined as before in the case of intuitionistic logic, but with the notion of dialogue replaced by the notion of classical dialogue.

a. Examples

There is a classical winning proponent strategy for the formula a \vee \neg a. It consists in the following classical dialogue:

    \[\begin{array}{rll} 0. & P\, a \vee \neg a & \\ 1. & O\, ?\!\vee & (\text{attack on }0)\\ 2. & P\, \neg a & (\text{defense against }1)\\ 3. & O\, a & (\text{attack on }2)\\ 4. & P\, a & (\text{defense against }1)\end{array}\]

At position 2, the proponent can defend against O’s symbolic attack O\, ?\!\vee only by asserting the right disjunct \neg a, since the atomic left disjunct a has not yet been asserted by O (compare condition 1). But due to the replacement of condition 3 by condition 3′, the proponent can defend against this attack again at position 4, now by asserting the left disjunct a, which has been asserted by O at the preceding position 3.

There is a classical winning proponent strategy for the formula \neg\neg a \rightarrow a, an instance of the intuitionistically invalid principle of double negation elimination. It consists in the following classical dialogue:

    \[\begin{array}{rll} 0. & P\, \neg\neg a \rightarrow a & \\ 1. & O\, \neg\neg a & (\text{attack on }0)\\ 2. & P\, \neg a & (\text{attack on }1)\\ 3. & O\, a & (\text{attack on }2)\\ 4. & P\, a & (\text{defense against }1)\end{array}\]

The last move is possible due to the replacement of condition 2 by condition 2′. At position 3 there are two open attacks by O (made at positions 1 and 3, respectively). Condition 2 would prohibit P’s defense against the first attack, since it is not the last open attack. But condition 2′ enables P to defend against any earlier open attack. At position 4, the proponent can thus defend against O’s first attack by asserting the consequent a of the attacked implication \neg\neg a \rightarrow a.

These two examples show that the replacement of both conditions 2 and 3 by conditions 2′ and 3′, respectively, is necessary to obtain classical logic. Otherwise, there would not be winning proponent strategies for (all instances of) either tertium non datur or the principle of double negation elimination, which are both principles of classical logic.
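The difference between the two dialogue systems thus lies entirely in which defenses are available to P. The following Python sketch contrasts the two regimes for a single decision, namely whether P may now defend against a given earlier attack by O; the bookkeeping parameters (which attacks by O count as open, and how often the chosen attack has already been defended) are assumed to be supplied by an implementation of the remaining, unchanged conditions, which are not covered here.

    def proponent_may_defend(attack_position, open_o_attacks, times_already_defended, classical):
        """May P defend now against O's attack made at attack_position?

        open_o_attacks: positions of O's attacks that are still open, in the
        order in which they were made; times_already_defended: how often this
        particular attack occurrence has been defended by P so far."""
        if attack_position not in open_o_attacks:
            return False
        if classical:
            # Conditions 2 and 3 are dropped for P: any open attack may be defended,
            # and the same attack may be defended repeatedly.
            return True
        # Intuitionistic dialogues: only the last open attack may be defended
        # (condition 2), and each attack at most once (condition 3).
        return attack_position == open_o_attacks[-1] and times_already_defended == 0

    # In the classical dialogue for ~~a -> a above, P defends at position 4 against
    # the attack made at position 1, although the attack at position 3 is also open:
    print(proponent_may_defend(1, open_o_attacks=[1, 3], times_already_defended=0, classical=True))   # True
    print(proponent_may_defend(1, open_o_attacks=[1, 3], times_already_defended=0, classical=False))  # False

    # In the classical dialogue for a | ~a, P defends a second time against the
    # attack made at position 1 (here counted as still open), which condition 3 forbids:
    print(proponent_may_defend(1, open_o_attacks=[1], times_already_defended=1, classical=True))   # True
    print(proponent_may_defend(1, open_o_attacks=[1], times_already_defended=1, classical=False))  # False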

b. Classical Dialogical Validity and Completeness

The dialogical notion of classical validity is defined as follows:

A formula A is called classically valid if there is a classical winning proponent strategy for A.

The following completeness theorem holds:

A formula A is classically valid if and only if A is provable in classical logic.

A proof of this theorem for a sequent calculus for classical logic can be found in Sørensen and Urzyczyn [2006].

A dialogical formulation of classical logic has thus been obtained by a modification of the dialogue conditions for intuitionistic dialogues. This means that one can obtain dialogical formulations of different logics by changing the rules of the dialogue games.

6. Origins and Recent Developments

The dialogical approach to logic was first proposed by Lorenzen in 1958 (Lorenzen [1960]; see also Lorenzen [1961]) for intuitionistic logic as well as for classical logic. That the dialogical approach as such cannot be taken as a foundation of intuitionistic logic is obvious, since a dialogical notion of classical validity can be obtained by modifying the dialogue conditions given for intuitionistic logic. If one wants to obtain a dialogical foundation for intuitionistic logic, it is therefore necessary to give a justification for the special kind of dialogues needed for intuitionistic logic. Such a justification has been proposed by Felscher [2002] (first published in 1986); it is based on the notions of contention, hypothesis and relevance.

The dialogical approach has been extended to several non-classical logics, including modal logic and linear logic; for an overview see Rahman and Keiff [2005] and Keiff [2011]. A dialogical setting for the interpretation of implications as rules has been considered by Piecha and Schroeder-Heister [2012]. Dialogical logic can thus provide a common basis for discussing different kinds of logics.

7. References and Further Reading

  • Andreas Blass. A Game Semantics for Linear Logic. Annals of Pure and Applied Logic, 56:183–220, 1992.
    • Presents a dialogue semantics for linear logic. Starting point for game-theoretic developments in computer science.
  • Walter Felscher. Dialogues, Strategies, and Intuitionistic Provability. Annals of Pure and Applied Logic, 28:217–254, 1985.
    • Constructive completeness proof for intuitionistic logic.
  • Walter Felscher. Dialogues as a Foundation for Intuitionistic Logic. In D. M. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, 2nd Edition, Volume 5, pages 115–145. Kluwer, Dordrecht, 2002.
    • Explains the basic concepts of dialogical logic, gives an overview on the literature on dialogues, and develops an argumentative foundation for a certain kind of dialogues as a basis for intuitionistic logic.
  • Wilfrid Hodges and Erik C. W. Krabbe. Dialogue Foundations. Aristotelian Society Supplementary Volume, 75:17–49, 2001.
    • Critical discussion between Hodges and Krabbe on dialogues as a foundation for logic.
  • Laurent Keiff. Dialogical Logic. In E. N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Stanford University, Summer 2011 edition, 2011.
    • Overview on dialogical logic, including dialogues for modal, linear and other non-classical logics. The presentation uses an alternative formalization of dialogical logic.
  • Erik C. W. Krabbe. Dialogue Logic. In D. M. Gabbay and J. Woods, editors, Handbook of the History of Logic, Volume 7: Logic and the Modalities in the Twentieth Century, pages 665–704. Elsevier North-Holland, Amsterdam, 2006.
    • Traces the historical development of dialogical logic. Contains a useful bibliography.
  • Kuno Lorenz. Basic Objectives of Dialogue Logic in Historical Perspective. In S. Rahman and H. Rückert, editors, New Perspectives in Dialogical Logic, volume 127 of Synthese, pages 255–263. Springer, Berlin, 2001.
    • Describes the development of dialogical logic in historical context, with emphasis on the notion of dialogue-definiteness.
  • Paul Lorenzen. Logik und Agon. In Atti del XII Congresso Internazionale di Filosofia (Venezia, 12–18 Settembre 1958), volume 4, pages 187–194. Sansoni Editore, Firenze, 1960.
    • First proposal of dialogues as a means to explain intuitionistic logic and classical logic.
  • Paul Lorenzen. Ein dialogisches Konstruktivitätskriterium. In Infinitistic Methods. Proceedings of the Symposium on Foundations of Mathematics (Warsaw, 2–9 September 1959), pages 193–200. Pergamon Press, Oxford/London/New York/Paris, 1961.
    • Dialogical explanation of the meaning of logical constants and of the meaning of inductive definitions.
  • Thomas Piecha and Peter Schroeder-Heister. Implications as Rules in Dialogical Semantics. In M. Peliš and V. Puncochár, editors, The Logica Yearbook 2011, pages 211–225. College Publications, London, 2012.
    • Formulates a dialogical semantics for the implications-as-rules approach.
  • Shahid Rahman and Laurent Keiff. On How to Be a Dialogician. In D. Vanderveken, editor, Logic, Thought and Action, volume 2 of Logic, Epistemology, and the Unity of Science, pages 359–408. Springer, Dordrecht, 2005.
    • Survey of dialogical formulations of a variety of logics.
  • Morten Heine Sørensen and Paweł Urzyczyn. Lectures on the Curry-Howard Isomorphism, volume 149 of Studies in Logic and the Foundations of Mathematics. Elsevier, New York, 2006.
    • Contains a completeness proof for classical dialogues.
  • Wolfgang Stegmüller. Remarks on the Completeness of Logical Systems Relative to the Validity-Concepts of P. Lorenzen and K. Lorenz. Notre Dame Journal of Formal Logic, 5:81–112, 1964.
    • Contains a comparison of the dialogical approach to semantics with the Bolzano-Tarski approach.


Author Information

Thomas Piecha
Email: thomas.piecha@uni-tuebingen.de
University of Tübingen
Germany

F. H. Bradley: Logic

Although the logical system expounded by F. H. Bradley in The Principles of Logic (1883) is now almost forgotten, it had many virtues. To appreciate them, it is helpful to understand that Bradley had a very different view of logic from that prevalent today.  He is hostile to the idea of a purely formal logic. Today, deductive logic is largely restricted to a study of the rules through which we can legitimately re-arrange our thoughts, permitting the elimination of items no longer required, but not allowing the addition of anything genuinely new.  Bradley had a much wider conception and took logic to be the discipline through which we give an account and explanation of the special function of thought through which we transcend immediate experience.  Bradley believes logic covers topics that would fall today under the heading of theory of knowledge.

For Bradley, the processes of thought through which we transcend immediate experience involve ideas, judgments, and inferences.  He begins with judgment and offers a natural account of both relational judgments with more than one subject and judgments without a special subject, such as: “It is raining.”  His general theory that the ultimate subject of all judgment is reality as such could also accommodate the mass terms that give modern logicians so much trouble.

Although Bradley accepts the credo of empiricism that all our knowledge begins in experience, he does not accept Hume’s view that our immediate experience is composed of a swarm of impressions. He rejects the theory, widespread at the time, that knowledge could be explained through the association of ideas derived from such impressions.  Neither psychological particulars nor any connections among them are the sorts of thing capable of representing anything beyond themselves.  Judgment requires “logical” ideas that are universal, not particular.

What most baffles readers is an esoteric doctrine in which Bradley assimilates judgment and inference as processes in which there is a movement of thought from a ground to a conclusion.  Unless there is a change, nothing has happened, but any change requires justification, if the inference is to be valid or the judgment true.  For the movement of thought to be satisfactory, the ground and justification cannot remain external and must be brought inside.  This is achieved to the extent that we can enlarge our system of thought.  It may seem that Bradley is now heading to a Hegelian solution in which the completion of the system of thought brings about the identity of Thought and Reality, but Bradley is not prepared to go this far.  This is, however, a matter for metaphysics and is beyond the scope of logic.

Table of Contents

  1. Biography
  2. Bradley’s Conception of Logic
  3. Judgment
  4. Logical Ideas
  5. Categorical Judgments
    1. Universal Judgments
    2. Analytic Judgments of Sense
    3. Synthetic Judgments of Sense
  6. Hypothetical Judgments
  7. The Esoteric Doctrine
  8. Other Types of Judgments
    1. Negative Judgments
    2. Disjunctive Judgments
  9. Other Topics
    1. Logical Principles
    2. Extension and Intension
    3. Modality
  10. Judgment: Concluding Remarks
  11. The Nature of Inference
  12. The Association of Ideas
  13. Inductive Inference
  14. Inference: The Inclusive Theory
  15. Inference and Judgment
  16. Formal Logic
  17. Truth and Validity
  18. The Final Doctrine
    1. Inference
    2. Judgment
    3. The Fundamental Problem of Thought
    4. Immediate Experience and the Absolute
  19. References and Further Reading
    1. Selected works by F. H. Bradley
    2. Further Reading

1. Biography

Francis Herbert Bradley was born in 1846 into a very large family that included the celebrated Shakespearean critic, A.C. Bradley.  Having studied at Oxford University, F. H. Bradley was awarded in 1870 a Fellowship at Merton College, where he remained until his death in 1924.  He was not required to teach and did not do so.  The dominant philosophy in England when he came to Oxford was the kind of empiricism originally due to John Locke, whose champion in the nineteenth century was John Stuart Mill. This theory attempted to explain cognition through the association of mental particulars, impressions and ideas, originally introduced into the mind, it was supposed, by external causes.  Bradley was implacably opposed to this position and determined to demolish it.  He gained assistance in this from his wide reading in German philosophy, but refused to call himself a Hegelian, since he denied the central principle of the identity of Thought and Reality.  Nonetheless, he is generally regarded as the central figure in the group of British Idealists in the late nineteenth century.

2. Bradley’s Conception of Logic

The principal source for Bradley’s thoughts about logic is a substantial two-volume work entitled The Principles of Logic, published in Oxford in 1883.  A second edition appeared in 1922, in which the original text was supplemented by a large number of additional notes and terminal essays through which Bradley expressed his mature position. (Page references in what follows will be to this second edition.)

Bradley had a very different view of logic from that prevalent today.  Today, logic is largely restricted to a study of the rules through which we can legitimately re-arrange our thoughts, permitting the elimination of items no longer required, but not allowing the addition of anything genuinely new.  Bradley had a much wider conception and took logic to be the discipline through which we give an account and explanation of the special function of thought through which we transcend immediate experience.  Logic, for Bradley, therefore covers topics that would fall today under the heading of theory of knowledge.

The processes of thought were traditionally taken to involve ideas, judgments, and inferences.  These topics, however, are very closely connected.  One could begin at any point, but Bradley proposes to begin in the middle with the faculty of judgment.

3. Judgment

Bradley’s central definition is as follows: “Judgment proper is the act which refers an ideal content (recognized as such) to a reality beyond the act.” (10) This definition immediately raises two serious questions: (1) What is this ideal content and how is it acquired? (2) What is reality and how is it accessed?  These are questions that Bradley tackles in considerable detail.  Moreover, the definition commits Bradley to the thesis that the structure of judgment is essentially subject-predicate, “that in every judgment there is a subject of which the ideal content is asserted.” (13)   The subject is what is real, and the predicate is the ideal content referred to it: judgment is essentially predication.

This is, of course, to display the form of the act or function of judgment.  It does not specify the essential structure of the ideal content, nor does it trap Bradley within the traditional logic of the categorical statement, as Russell believed.  Categorical statements involve the combination of two terms—a subject term and a predicate term—with the two terms united by the copula in such a way that the act of combination is the act of judgment.  Bradley resists this account on the ground that the ideal complex expressed is the same whether the proposition is asserted or merely entertained.  “We may say then, if the copula is a connection which couples a pair of ideas, it falls outside judgment; and, if on the other hand it is the sign of judgment, it does not couple.  Or, if it both joined and judged, then judgment at any rate would not be mere joining.” (21)  It is not even true that every judgment contains two ideas: on the contrary, it has but one.  The ideal content may be as complex as you please: it may be “a complex totality of qualities and relations” (11); but even if we distinguish separate ideas within the complex, it is as a unit that it is referred to reality.  When we assert that the wolf eats the lamb, it is the whole complex that is referred beyond the act of judgment, even if we distinguish within it the separate ideas of (at least) the wolf and the lamb.

Because we can distinguish separate objects such as the wolf and the lamb that can function as special subjects, we can draw at the level of logic a distinction between singular judgments that characterize single things and plural judgments in which a number of such things may be related.  But even with non-singular judgments, we must assume a unified reality within which various objects are assigned a place.

Bradley’s theory that relational judgments that appear to refer to a number of identifiable and discriminable individuals actually presuppose a single underlying reality gets confirmation from his logical analysis of a kind of judgment in which this reality is introduced directly.  This is the kind of judgment that denies the existence of things of a certain type, such as sea-serpents.  “Sea-serpents do not exist” has “sea-serpents” as its grammatical subject, but we must distinguish the grammatical subject from the real subject that confers a truth-value upon the statement.  Sea-serpents are not the reality to which we refer when making this judgment, since there are no sea-serpents.  The correct logical analysis is something like: “Reality is such that it contains no sea-serpents.”  This corresponds to: “Reality is such that A and B are simultaneous.”  Bradley can therefore handle this kind of judgment without presupposing the existence of what is denied.  What he presupposes is the reality that is the ultimate subject of every judgment.  The competing analysis offered by modern logic through the negation of existential quantification presupposes a universe of discourse comprising all possible values of the individual variables in the system.

Judgment has a dimension of truth and falsity, and Bradley uses this to confirm his view that judgment necessarily involves a reference to what is real.  “For consider;” he says, “a judgment must be true or false, and its truth or falsehood cannot lie in itself.  They involve a reference to a something beyond.  And this, about which or of which we judge, if it is not fact, what else can it be?” (41)  It may be thought that logical truths, said to be true in all possible worlds, are an exception.  For Bradley, logical truths, or tautologies, are not true in all possible worlds: they are not true in any possible world. “A bare tautology …is not even so much as a poor truth or a thin truth.  It is not a truth in any way, in any sense, or at all.” (Appearance and Reality, Note A, 501.)

4. Logical Ideas

Bradley’s definition of judgment introduces “ideal content.”  What is “ideal content” and how is it acquired?  Bradley was completely sure that the psychological particulars with which empiricists furnished the mind could not begin to explain judgment, knowledge, and cognition.  If such things existed, they certainly could not function as predicates in judgment, since they could not be moved from their place in the mind.

What Bradley had to explain was how we get from psychological ideas, which are mental particulars, to logical ideas, which are universal ideal contents, while preserving the information that the impressions have no doubt acquired from elsewhere.  He begins by distinguishing two sides that belong to every psychological idea—its existence as a mental particular and its content.  “We perceive both that it is and what it is.” (3)  Unlike existence, content can be loosened from its home in the psychological idea and transferred elsewhere—a loosening of content that takes place within the act of judgment.  It is not, however, the entire content of the psychological idea that is used in judgment.  The original content, he says, is “mutilated.” That the acquisition of ideal content involves abstraction is more clearly appreciated, if we move from the Humean picture of a swarm of distinct impressions arriving together in the mind to the notion of an organic immediate experience with which Bradley is more comfortable.  It is clear that the logical ideas used in judgment require the separation of elements within the “sensuous felt mass” presented in immediate experience.  Even if we begin, however, with an isolated impression or sense-datum, we must recognize that universals are associated at different levels.

Bradley makes an unsuccessful attempt to explain what he has in mind by using the notion of a symbol.  A symbol, such as a particular inscription, has, like everything else, two sides: its existence and its content.  But it has also a third side—its meaning or signification.  This meaning can be identified with the logical idea used in judgment. The symbol RED has as its meaning exactly what we assign to a variety of objects in the act of judgment.  This provides an opening for Frege and those who favor the linguistic turn to slip in an item distinct from any image or psychological idea that may be associated with the word.  (The logical idea is, of course, to be identified with what Frege calls the sense of the sign, not the referent.)  But the attachment of the idea to the symbol through decision or convention does nothing to explain the connection between the abstract universal and the immediate experience which must be its home.  It is only because we can abstract a part of the given content that we obtain the sense that we attach to the sign in the language.

5. Categorical Judgments

a. Universal Judgments

The standard classification of judgments distinguished categorical, hypothetical, and disjunctive.  Bradley reduces the universal form of the categorical judgment to a hypothetical form.  The universal form does not even guarantee the existence of real things to which we refer.  “All trespassers will be prosecuted” is designed to ensure that the subject class remains empty.  Thus, “Animals are mortal” becomes “If anything is an animal, then it is mortal.” (47) Bradley admits that he got this from Herbart, and Russell admits, in turn, that he got it from Bradley.

b. Analytic Judgments of Sense

Singular judgments, however, are different.  Bradley takes as his example: “I have a toothache.”  I and my toothache are both individual, but I describe my condition in general terms as “suffering from toothache.”  This example belongs to the first division of singular judgments that he calls “analytic judgments of sense.”  “The essence of these is to hold only of the now, and not to transcend the given presentation.” (56)  Analytic judgments of sense do not always have a grammatical subject or copula.  We may call the cry “Wolf” a warning, but it is also a statement of fact, or is supposed to be.  The cry of “Wolf” or “Rain” refers to an undifferentiated present reality. The thought is that a wolf is somewhere and that rain is everywhere, at least everywhere that matters.  But there are also singular judgments without grammatical subjects in which we qualify by our idea “but one piece of the present.” (57)  One way to do this is by pointing.  I point to my dog and say “Asleep.”  Bradley rejects the view that the grammatical subject is merely suppressed, even if a grammatical subject may appear when my judgment is reported.

Bradley identifies a second kind of analytic judgments of sense that do have a grammatical subject. “The ideal content of the predicate is here referred to another idea, which stands as a subject.  But in this case, as above, the ultimate subject is no idea, but is the real in presentation.  It is this to which the content of both ideas, with their relation, is attributed.” (57)  “This bird is yellow” is a typical example.  The ideal content “bird”, perhaps aided by a pointing finger, is used to identify the particular object that is the special subject of the judgment.

In addition to analytic judgments of sense in which a real object is introduced through what we would now call a definite description, there are other cases in which a proper name is used, such as “John is asleep.”  The name “John” is bestowed to help us identify a particular person.  Bradley attacks the view that a proper name has a denotation, but no connotation.  The proper name is a sign connected with what it denotes, but I could not identify what it denotes without some descriptive content to help me recognize it.

c. Synthetic Judgments of Sense

The discussion of proper names allows Bradley to move to a second category of singular judgment: synthetic judgments of sense.  “Proper names,” he says, “have a meaning that always goes beyond the presentation of the moment.” (61) In using the name of a person, we assume an existence that goes beyond what is available in immediate experience, a reality that appears but is distinct from its appearance.  In a synthetic judgment of sense, “we make generally some assertion about that which appears in a space or time that we do not perceive.” (61-2) But how is this possible?  How can we make a judgment about a reality that appeared in the past, will appear in the future, or is now over the horizon, if we encounter reality only through presentation in immediate experience?  No idea can capture the uniqueness of the day that is last Tuesday.  We can form the idea of a certain kind of event: we can form the idea of an extensive history involving as large a sequence of events as you please, but such ideal contents cannot capture the unique past that actually took place, which alone can make the ideas we refer to the past either true or false.

For Bradley, the solution requires a crucial distinction between “this” and “thisness”.  Only this day is today.  Yesterday was today yesterday, but it is no longer today today.  Today is also a particular day distinct from every other day and has its own date.  It has its own position in a series of days within which every day is rigidly ordered through the relation of earlier and later. This series of days does not change, even when it is envisaged at different times.  It is therefore a universal ideal content, and each day within the series has particularity or “thisness”.  After McTaggart, the series has been known as the B-series.  This ideal series can be attached to reality, only through the identification of a particular day within it with the reality given in present experience, which will turn that day into “today.”  Once this is done, days that come after the day with that date are future days that will be real, and days that come before are past days that were real.  This introduces the McTaggart A-series.  To explain Bradley’s theory, the unit “day” has been used, although it does not appear in the text and involves an oversimplification, since we cannot identify an entire day with the present of immediate experience.  On the other hand, it would be a complete mistake to identify the immediate present with an instant or a moment, imagined as either the end product of the infinite division of a period of time or as the interface between adjacent periods.

Since we cannot introduce a reference to what really happened in the past or will really happen in the future, which synthetic judgments of sense seem to demand, through the construction of even the most complex and extensive ideal content constituting a history of a possible world, how is the feat to be accomplished?  Bradley’s solution is that although I can access reality only through a point of contact in immediate present experience, reality is not restricted to its appearance in my experience.  The problem of appearance and reality is metaphysical and requires another book; but even at the level of logic it is clear that the identity of reality and what appears in experience is not mandated.  “If the real must be ‘this’, must encounter us directly, we cannot conclude that the ‘this’ we take is all the real, or that nothing is real beyond the ‘this’.” (70) Being given in experience is not a quality of reality “in such a sense as to shut up reality within that quality.” (70)  An ideal content can be true “because it is predicated of the reality, and unique because it is fixed in relation with immediate perception.” (72) Since immediate perception may involve an experience of change, a fragment of the temporal series may be abstracted and extended indefinitely through an ideal process.

Bradley has one further move to make to introduce the idea of a particular fact. “The idea of particularity implies two elements. We must first have a content qualified by ‘thisness’, and we must add to that content the general idea of reference to the reality.” (77) Without the second element, we have members that are exclusive within the series, “but the whole collection is not unique.” (77) For absolute uniqueness, we require the connection of the series with direct presentation.  To think of tomorrow we may require a universal ideal content to connect it with today, but the day we think about is as unique as is today.

6. Hypothetical Judgments

Bradley handled universal judgments by reducing them to hypothetical form, but how can a hypothetical judgment be taken as true, since its antecedent is supposed, but not categorically affirmed?  Modern logic evades this problem by treating hypothetical statements as truth-functional, but this evasion has consequences. For Bradley, the hypothetical judgment involves an ideal experiment.  “The supposal is treated as if it were real, in order to see how the real behaves when qualified thus in a certain manner.” (86)  The connection of the components is what is asserted in the hypothetical judgment, and it is this that has its ground in reality.

Bradley believes that not only are all universal judgments hypothetical, but also that all hypothetical judgments are universal.  This may be thought doubtful, since there seem to be exceptions.  “If this man has taken that dose, he will be dead in twenty minutes.” (89)  This would not be necessarily true of any man who took the dose; but if the judgment is true, there will be some universal connection, even if restricted to the case of that specific man.

Bradley is assuming that the truth of a hypothetical statement must depend on some (possibly) latent feature of reality.  Singular judgments, however, appear to connect us more directly with solid fact.  The synthetic judgment of sense has its special status as categorical because of its connection with a reality actually given.  It therefore depends on the analytic judgment of sense which assigns an ideal content to that given. Bradley has already argued that all universal statements are hypothetical.  This is now widely accepted.  He now moves to the startling claim that all singular statements are hypothetical, which he recognizes as an “unwelcome conclusion.” (91)   Construed as categorical, analytic judgments of sense are all false, because they do not provide the whole truth about what is given in immediate experience, far less the whole truth about reality.  This follows from his original story that an ideal content used in judgment is limited to part of the content of the given reality.  But to say that the judgment is not the whole truth is not to say that it is not wholly true and hence partly false, even false tout court. Bradley complains that the choice of an ideal content to qualify the immediate given is arbitrary.  Arbitrary is too strong, since the choice may very well have a purpose, but even if it were arbitrary, the assignment of universal content to the given reality would be just as true as the choice of any other content from the selection available.

Bradley is suggesting that the loosening of part of the content of the given reality that he introduced earlier as the very essence of thought is doomed to failure in advance.  This is why he talks about “mutilation”.   But the success or failure of the operation is surely relative to what it is intended to achieve.  It is not designed to provide an ideal content that will be a complete characterization of reality as a whole; it has surely a much more limited aim.   One idea is that loosening a part of the content is associated with separating out a segment of the given reality that conforms to the concept introduced.  Loosening the concept of a dog from what I am given allows me to separate out Fido and perhaps other dogs within my field of view.  The analytic judgment of sense that here is a dog would appear to be categorically true.  This way of explaining the function of the judgments immediately associated with the loosening of ideal contents would allow Bradley, were he so minded, to make peace with logical systems, such as both Aristotelian and modern logic, that give a central position to the individual object. (This is essentially the problem of “special subjects”, discussed in Campbell: 1967.)

7. The Esoteric Doctrine

We have now come to a parting of the ways.  If we accept the truth of analytic judgments of sense, such “judgments that analyze what is given in perception will all be categorical.” (106)  Abstract, universal judgments will all be hypothetical.  Synthetic judgments “about times and spaces beyond perception” (106) are also categorical, although they require inferences that rely on the universal.  Bradley is prepared to allow those who lack the courage to follow him to a more esoteric theory “to remain at a lower point of view.” (106)    Bradley, however, proposes a trip to a region where the “distinction between individual and universal, categorical and hypothetical, has been quite broken through.” (106)  It is at this higher level that Bradley’s logic becomes so difficult, perhaps impossibly difficult.  At the lower point of view, we separate out individual objects that we characterize through universal properties and relations in singular and plural judgments.  Bradley begins the move to what is higher (or deeper) with the point that these individual objects are conditioned by the setting in which they are found.  They are not unconditioned, but are asserted subject to a condition.  What is subject to a condition can be asserted categorically, if the condition is taken as satisfied.  Bradley is well aware that conditional and conditioned are not the same.  “A thing is conditional on account of a supposal, but on the other hand it is conditioned by a fact.” (99)  His argument is that for anything with a setting in space and time, the condition can never be satisfied.  To introduce the series of conditions in space and time is to introduce a chain whose last link hangs unsupported in the air. This is a worrying argument, traditionally used to prove that the world must have a beginning in time (perhaps also a First Cause), or else by Kant to vindicate transcendental idealism.  The assessment of how far it provides a solid support for what Bradley proposes to build on it will be postponed until 18b.

Rejecting the categorical judgment that assigns an ideal content to the segment of reality from which it has been loosened, Bradley is left with no more than hypothetical judgments.  These cannot even be our standard hypothetical judgments that are composites of categorical statements.  They are mere husks, connecting adjectives.  For example, “If lightning, then thundering.”  Certainly, hypotheticals that connect adjectives are in a way also categorical, since they affirm a ground of connection in reality.  But we have lost our standard hypothetical judgments and are left with mere scraps.  Even more baffling is the replacement we are offered for a singular judgment in the higher point of view.  “Instead of meaning by ‘Here is a wolf,’ or ‘This tree is green’ that ‘wolf’ and ‘green tree’ are real facts, it must affirm the general connection of wolf with elements in the environment, and of ‘green’ with ‘tree.’” (104)

Bradley offers a further explanation of his “unwelcome conclusion” in Terminal Essay II, which I discuss in 18b, where I offer a way of escape.  In the meantime, he returns from the heights and provides a more mundane account of other kinds of judgment.

8. Other Types of Judgments

a. Negative Judgments

Bradley now turns to negative judgments.  Negative judgments, he believes, are more complicated than affirmative, since they must begin with a suggestion that is rejected in the judgment.  Moreover, this rejection must depend on the assumption of a positive ground of exclusion, even if what this is may not be known. Negative existential judgments are of particular interest. In “Ghosts do not exist,” the grammatical subject cannot be the real subject; the real subject is the nature of things to which we deny the quality of harboring ghosts.  The positive character of reality that excludes ghosts is not, however, determined through the negative judgment.  This entails that the same character of the real may exclude a variety of different suggestions.  The suggestions excluded have their source in an ideal experiment and not in the nature of reality.  The negative judgment affirms that some quality of the real excludes a suggestion, but it does not determine what quality that is.  The truth of a negative judgment depends on a quality of the real incompatible with the quality excluded in the judgment.  The true quality and the quality assigned in the judgment are thus contraries and not contradictories.  The way in which a negative judgment presupposes a quality in what is real that we may not be able to specify may be compared with the way in which a hypothetical judgment presupposes the same kind of quality as grounding its connection.  It follows that the negation of a hypothetical judgment would be the rejection of this sort of ground.  The mere assertion of the antecedent and the negation of the consequent is indeed incompatible with the hypothetical judgment, but it is not its contradictory.  A genuine contradictory would be strong enough to rule out counterfactual conditionals.

b. Disjunctive Judgments

Bradley understands disjunction as providing a list of two or more mutually exclusive alternatives.  He is willing to associate disjunction with a nest of hypothetical judgments, but since neither the hypothetical judgments nor the disjunction is truth-functional, the disjunctive judgment may have a certain categorical aspect. “Disjunctive judgment is the union of hypotheticals on a categoric basis.” (131)

Bradley connects disjunction with choice, where we make a selection from a number of alternatives.  There is a definite list of possibilities; this is its categorical feature.  We cannot use disjunctive addition to add in an arbitrary fashion another disjunct that is not a real possibility.  In the same way, to say that something is colored is associated with a list of possibilities from which we select the actual color.  To produce the disjunctive judgment that lists the varieties of color is to assign to the object categorically the property of being some kind of color, even if we do not know which color it is.

This example conforms to the template that Bradley favors in place of the form “either p or q or…” that is used today. Bradley treats the disjunctive judgment as a kind of singular judgment, with the format “A is either b or c or d….” This analysis will run into difficulties when A does not exist, but Bradley has met this problem before, and deals with it by replacing the grammatical subject with the real subject.  This maneuver can even handle cases that seem most recalcitrant, such as “Either the light bulb is dead or the fuse has blown.”  This would become: “Reality is either characterized by light bulb malfunction or fuse meltdown.”

9. Other Topics

a. Logical Principles

Chapter V examines logical principles.  Bradley dismisses the Law of Identity as an empty tautology. Judgment requires the identity of differences, not provided by “A is A.” This means that the accusation (by Bertrand Russell) of confusing the “is” of predication with the “is” of identity cannot be fair, since for Bradley predication is the essence of judgment, whereas through the “is” of strict identity we do not make a judgment at all.         

The most interesting part of the section on “The Principle of Contradiction” is the discussion of (Hegelian) dialectic.  Bradley’s simple solution is that if the ideas combined in the synthesis are merely different, there is no problem. The ideas of self and other are different ideas, but no one would say that it is a contradiction to assert the existence of the self and other things as well.  The challenge to the principle of contradiction comes only if the different ideas combined are taken to be discrepant or contrary, since the contrary of a given proposition entails its contradictory. Bradley offers a compromise according to which ideas that appear to be contrary are reconciled when harmonized within a wider reality.  For example, opposite properties can be assigned to the same thing at different times.

The Law of Excluded Middle takes the form of a disjunctive judgment and would be expressed today as “either p or not p.” Bradley, however, has a different form for disjunction, so that his version of the principle will be: “A is either b or not-b.” A is not always a real particular thing, but sometimes reality as such.  Indeed, if Bradley gets his way, the ultimate subject will always be reality.  Excluded middle uses the variety of disjunction in which the number of disjuncts is exactly two.  When the second disjunct is constructed as the negation of the first, there can be no other choice.
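
For orientation, the three principles may be set out side by side; the traditional renderings follow the forms Bradley discusses, while the symbolic versions are a modern gloss rather than Bradley’s own notation:

    Identity:         A is A                   (modern gloss: p → p)
    Contradiction:    not both b and not-b     (modern gloss: ¬(p ∧ ¬p))
    Excluded middle:  A is either b or not-b   (modern gloss: p ∨ ¬p)

On Bradley’s reading, the first is an empty tautology, the second excludes the assignment of genuinely discrepant qualities to the same thing in the same respect, and the third is a disjunctive judgment whose two disjuncts exhaust the alternatives.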

b. Extension and Intension

Bradley next tackles the familiar distinction between intension and extension in the chapter on the quantity of judgment, explaining that “in every symbol we separate what it means from that which it stands for.” (168)  (Compare Frege’s distinction between Sinn and Bedeutung.)  His account of the extensional treatment of universal judgments such as “Dogs are mammals” is disappointing, because he fails to register that a set is a special kind of entity, suggesting that a set of dogs must be a pack of dogs, failing which the only alternative is the ludicrous idea of a collection of dog-images in the mind!  With a proper notion of set in place, “Dogs are mammals” can be taken to assert a relation between two sets, just as many other judgments assert a relation between two objects.

Judgments founded on intension refer to the connection of attributes and meanings, and ignore the denotation of objects.  Universal judgments based on meanings are those Kant considers strictly universal, because they do not permit even the possibility of exceptions. Not all universal judgments are of this type, and singular judgments never are. Our concept of what is real, denoted in a singular judgment, is the concept of the individual, which is both particular, excluding all other individuals, and universal, as unifying various characteristics and constituting an identity in difference.  The real individual is a concrete universal: abstract universals, which can be separated from the individual in thought and applied elsewhere, cannot be real.  In a similar way, what is truly individual is a concrete particular; abstract particulars that are nothing more than their distinction from other particulars are also unreal.  “A reality in space must have spatial diversity, internal to itself.” (188)   A point in space is distinct from all other points, but is a mere abstraction.  A moment in time is also an abstraction; a concrete individual existing in time must have some duration.

c. Modality

Bradley rejects as erroneous the view that modal differences do not affect the actual content of the judgments involved.  Certainly, you can take any judgment and “express any attitude of your mind towards it.” (198)   These propositional attitudes are many and various.  I may say: “I wish to make it” or “I fear to make it” or “I am forced to make it.”  “All these are simple assertorical statements about my condition of mind.” (198) Statements about possibility and necessity do not, however, express my state of mind.  They are assertions that claim objective truth.  “There clearly can be but one kind of judgment, the assertorical.  Modality affects not the affirmation, but what is affirmed.” (197)  This is in line with the logic of Principia Mathematica, in which everything takes place under the aegis of the assertion sign.  In this system there is no sign of denial corresponding to the assertion sign; negation applies only to the proposition asserted, never to the act of assertion.  This is more extreme than Bradley, who does allow negation a distinct function.

Thus, judgments of necessity and possibility have a special content not to be found in the corresponding assertoric judgment.  For Bradley, “The possible and the necessary are special forms of the hypothetical.” (198) Necessity consists in a necessary connection between antecedent and consequent in a hypothetical judgment.  To say that a fact is necessary is not to elevate it to a higher status, but merely to say that it is a necessary consequence of some other state of affairs, also taken as fact.  As already explained, the connection through which the antecedent necessitates the consequent must itself depend on a categorical ground.  This includes cases where we assert a necessary connection, because of a regular succession of events.  Not that this ground has to be a necessary causal connection.  “The real connection which seems the counterpart of the logical sequence, is in itself not necessary.” (206)

Bradley also connects the possible with the hypothetical.  To say that something is possible is to say that some of its conditions are satisfied, excluding those specified in the antecedent of the associated hypothetical statement.  “It is possible to see an eclipse of the moon tonight” means “If you get up early enough and the weather co-operates, you will see an eclipse of the moon.”  To assert a potentiality or power or disposition is to commit to a hypothetical judgment stating that if certain other conditions are satisfied, a certain state of affairs will necessarily come to pass.

Bradley has a problem with modality because of his metaphysical vision of a Parmenidean Absolute Reality.  Modal distinctions come to life with the conception of an open future, in which some things are unavoidable and others are possibilities among which we may choose.  What is actual at the present time cannot be properly said to be either possible or necessary (Bradley gets this right!), although some things that have taken place were necessary and others were not.  Without this kind of background, the conceptual scheme Bradley is discussing would not exist.

10. Judgment: Concluding Remarks

In his 1957 presidential address to the American Philosophical Association, “Speaking of Objects,” W.V. Quine presents the manifesto for the position of modern logic.  “We persist in breaking reality down somehow into a multiplicity of identifiable and discriminable objects to be referred to by singular and general terms.  We talk so inveterately of objects that to say we do seems almost to say nothing at all; for how else is there to talk?” The reality to which Quine refers at the beginning is swept under the carpet and heard from no more.  For Bradley, the reality that is broken down is, and has to be, the reality available in immediate experience.  It is broken down through the faculty of thought and judgment, which introduces distinct individuals characterized through universal logical ideas. This makes possible singular and plural judgments involving qualities and relations.  Not all judgments about what is real conform, however, to this template.  There are genuine judgments about reality that bypass a reference to real individuals.  Some such judgments modern logic may handle in other ways, but there are some that remain troublesome, such as judgments involving mass terms.  Bradley’s system of logic is more flexible and can handle the variety we find.

The strength of Bradley’s theory of judgment is the flexibility through which it accommodates a variety of forms.  Its weakness is that through insisting that the ultimate subject of judgment is reality, he seems to undermine the legitimacy of the singular and plural judgments on which we normally rely. One way to retain Bradley’s logic while rejecting the absolute monism of his metaphysical theory is to recognize that “reality” is itself a mass term.  The later developments in the logic of mass terms (terms that have proved such a headache for modern logic) also make Bradley’s logic more palatable.  Concepts, like “gold”, which do not by themselves package reality into units in the same way as count nouns like “dog”, can be used in various ways.  They can be used in a singular judgment to refer to a piece of gold; they can be used in plural judgments to refer to pieces of gold; and there is also a third use, as in “Gold is yellow,” where the concept is associated with a mass term.  (Interestingly, Bradley uses this very example (46) without noticing its special character.)  The possibility of this third use surely does not invalidate the other uses in singular and plural judgments.

This explanation of the process described by Quine is, of course, given at Bradley’s lower point of view, but the use of a mass term to designate the setting for the individual object, in place of a string of other individuals, may well discourage the desire to move to the mysterious higher view.  To isolate within the sensuous felt mass, designated by a mass term, an individual object associated with an ideal content loosened from what is given, seems about as good an account of the process of thought as we can get.

11. The Nature of Inference

Bradley moves on in Books II and III to the important topic of inference.  There is a problem emerging from the distinction between analytic and synthetic judgments of sense introduced in Book I, in that the synthetic judgments move us beyond what is given in immediate experience and must involve some kind of inference.  In a book on the principles of logic, Bradley must also engage with the traditional doctrine of the syllogism, which was taken to be the core of deductive inference.  Bradley proposes in the second book to deal with deductive inferences generally agreed to be valid, without probing too deeply, before moving in the third book to a fundamental theory intended to cover all forms of inference.

He begins by setting out three features of inference with which it is difficult to disagree.  First, the conclusion of an inference depends on a process of thought through which it is reached.  Second, the process rests on a basis.  “In inference, we advance from truth possessed to a further truth.” (245)  Third, there must be a difference between basis and conclusion; otherwise, the supposed inference is a “senseless iteration.” (246)

Bradley makes a list of forms of deductive inference, casting his net more widely to capture specimens that do not usually appear in the textbooks of the day.  The traditional syllogism cannot be taken as fundamental, since it does not cover all the forms that Bradley has listed, such as those that turn on transitive relations.  Bradley describes the process of inference as an operation of synthesis which “takes its data and by ideal construction combines them into a whole.” (256) Logical connection, however, requires the identity of common links, such as the middle term in a syllogism.  The first step is to form the whole; the second step is to extract the conclusion perceived within the whole by omitting parts that are no longer of interest.  Bradley denies that there is any general principle that will serve as a test of the validity of reasoning. The traditional syllogism is not up to the job and no replacement can be found.
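
An example of the kind Bradley has in mind, and one he himself uses later in the book (432), is an inference through a transitive relation:

    A is to the right of B.
    B is to the right of C.
    Therefore, A is to the right of C.

The inference turns on the identity of the common term B appearing in both premisses, yet it cannot be cast as a syllogism without adding a premiss asserting the transitivity of the relation, which merely relocates the inferential work.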

The common link required to combine premisses is both the same and different.  “If it were not different it would have nothing to connect, and if it were not the same there could be no connection.” (288)  But how can we have both identity and difference?  The solution is that the common term is an ideal content “appearing in and differenced by two several contexts.” (288)

The process of inference depends entirely on this identity in difference.  There are, however, two radically different kinds of identity that Bradley does not distinguish at this point.  There are universal characters which are identical throughout their various instantiations (abstract universals) and there are individual objects that remain identical throughout their various appearances (concrete universals).  These individuals may even combine characters that are in some sense discrepant, if they are extended in space or enduring in time.  Caesar was in Gaul, and Caesar was in Italy.  Both types of identity in difference can provide a ground for inference, even within traditional syllogistic logic.  By suggesting that inference takes place only through the development of an ideal content and not via reference to an individual object, Bradley undermines the singular judgment and prepares the ground for a logical doctrine that downgrades it.

12. The Association of Ideas

The “association of ideas” is the name for a process that exists as a psychological fact; what Bradley is attacking is the empiricist account of this fact and the use of it to explain judgment and inference.  The empiricist theories of David Hume and John Stuart Mill attempt to explain the life of the mind in terms of the association of ideas that are distinct existences or psychological atoms.  The laws of association usually recognized are contiguity and similarity.  Bradley argues that the empiricists do not have the resources even to state clearly their central position, and offers the following restatement: “Any element tends to reproduce those elements with which it has formed one state of mind.” (304) He calls this law “redintegration”, getting the term from Sir William Hamilton.  The use of the qualification “tends” is standard for laws of association.  Bradley insists that his law “does not exclude any succession of events which comes as a whole before the mind,” (305) which is, of course, vital for the explanation of causal inference.

In spite of a superficial resemblance, there is a chasm that divides Bradley’s redintegration and the association of the empiricists.  Association is cohesion between psychical particulars: redintegration concerns the connection of universals, “which is an ideal identity within the individuals.” (306)  Only an ideal connection in the mind can survive the disappearance of connected individuals.  The impressions originally given in conjunction are gone and cannot be resurrected.  Only the universal ideal content, the “what” as opposed to the “that” is left behind as a memory trace.  Through the universals, we may perhaps be able to produce images that are, as it were, ghosts of the past, but these images will be fresh particulars and distinct existences that can be considered re-incarnations of the past, only in virtue of an ideal identity preserved through the universal.

In the empiricist theory developed, for instance, by John Stuart Mill, the bare contiguity of impressions was not considered to be by itself sufficient to operate the mechanism of association.  Past contiguity can be operative only if the memory thereof is introduced through the similarity between a component in a past experience and a sensation now being enjoyed.  But we still face the problem: “What has been called up has never been contiguous; and what has been contiguous cannot be called up.” (318) Not even similarity can resurrect what is now dead and gone.  Similarity can exist, only if the similar terms both exist. Therefore, reproduction through similarity is not possible, since the similarity requires that what is reproduced is already there.

There are few traces surviving today in either psychology or philosophy of the theory demolished by Bradley.  The violence of the rhetoric, although amusing, might be considered excessive, but in its day the theory was solidly entrenched, and dynamite may have been justified.

13. Inductive Inference

It seems that we often make inferences from particulars to particulars.  We take note that Fido barks when approached by a stranger; we infer that Rover will do the same.  Bradley denies that such inferences tacitly involve the inductive generalization that all dogs bark when approached by strangers, since people quite happy to make the inference from Fido to Rover might be reluctant to issue a general guarantee for all inferences of this type.  This does not mean, however, that universals are not involved.  The inference to the barking of Rover is based on a connection of ideal content, acquired through the encounter with Fido.

Bradley now turns to inductive generalization through which we reach a conclusion about all members of a certain class when only some members have been examined. This arena is the stamping ground of John Stuart Mill against whom Bradley directs his fire. Even if Mill’s Methods may be useful, standard textbooks agree that they are not logically sound. Bradley endorses the usual criticisms, and adds the point that in any case they do not take us from mere particulars to general truths, since the facts from which they begin are already conceptualized as instances of general kinds.

14. Inference: The Inclusive Theory

The story so far is that inference operates by combining premises that contain a ground of identity.  A conclusion is reached by eliminating the middle term.   Bradley now recognizes that this theory will not cover all forms of reasoning and sees the need for a third book in which to put things right. The original theory will handle the syllogism and many other arguments. What it does not cover is arguments where there is no elimination of a middle term, where the conclusion emerges as a structure incorporating A, B, and C on the basis of information relating A to B and B to C.  An example may clarify what Bradley has in mind.  We connect a day to the day before through the identity of the intervening night and the same day to the day after through a similar process.  In this way we construct a succession of days that will constitute a history.  This result will count as the conclusion of an inference in the wide sense.

Mathematics is also important in our cognitive life, and often not covered by the theory in Book II.  Other exceptions are the processes of comparison and distinction.  These are mental operations resulting in judgment, and are therefore inferences. Recognition is also inference, when we make the move from the perception of the man entering the room to the recognition of someone seen before.

Hegelian Dialectic also transcends the pattern permitted in the original theory.  Bradley offers a heretical version that tones down the excesses of the orthodox view.  Instead of supposing that the process begins in contradiction, Bradley suggests that our unrest begins in the recognition that the original datum is incomplete.  The dialectical move is to complete the incomplete through positing a larger whole in which it is a component.  This larger whole is itself seen to be incomplete, and the process is repeated.  The way in which the incomplete is completed has its source in the subject.  Although a dialectical move may have a source in past experience, the inferential move goes directly from the datum to what lies beyond, even if we are able sometimes to uncover a hypothetical judgment expressing the function that controls the inference.

Bradley is now ready to unveil general characteristics of inference.  Because it is intended to cover all cases, this will have to be vague.  In the beginning is a datum or data, followed by a mental operation, producing a result.  For example, in the inference: “A to the right of B, and B of C, and therefore A to the right of C” (432), we begin with “two sets of terms in relations of space” (432) and put them together.  This act of construction makes a difference, “but it does not make such a difference to the terms that they lose their identity.” (432-3)   Nor do A and C change their identity when directly related in the conclusion.   Inference makes a change, but it does not change the world.  Bradley often describes inference as “ideal experiment.” It is a movement of thought that we make, but we are not compelled to take this path.  If we have several premises, we are not compelled to put them together.  The act of combination is arbitrary, in the sense that it is something that we choose, but might not have chosen.  The act of inference is not a revision of the original data, although it introduces a fresh thought.

This makes sense where there is more than one premiss and an act of combination is required that depends upon the will of the agent.  But Bradley discovers many inferences where the conclusion issues through the development of a single premiss. Certainly, there is no inference without mental activity in which we begin with a datum and end with a judgment predicating a fresh characteristic; but does such intellectual activity all count as inference? Standard inference involves “a construction round an identical centre” (457), but there are non-standard inferences in which there seems to be no given identity.  However, the middle process, the operation leading from datum to conclusion, cannot “dispense with all identity.” (457)   The mere co-presence of all my thoughts is not enough, since this does not explain the special identity that enables the inference. Take “recognition” and “dialectic”, where we are given a real thing with a quality and infer another quality.  The inference depends on the connection of these qualities, and we might want to say that the middle term is the given quality.  The problem is that the connection of the qualities is neither explicit nor given. “It is a function of synthesis, which never appears except in its effects.” (458) “It is a construction by means of a hidden centre.” (458)

Bradley distinguishes two operations associated with inference: synthesis and analysis.  In synthesis the many become one; in analysis the one becomes many.  Bradley makes a further distinction between analysis and elision.  We may begin with a judgment about a given whole, move by analysis to a plural judgment about its elements, and then by elision reach a conclusion about specific elements.  Central cases of inference in which premises are combined and a middle term eliminated involve both synthesis and analysis, but there are other inferences in which one or other operation is at least predominant.

Although they are different functions, analysis and synthesis have an intimate connection.  In analysis, the elements in the result are separated, but this means that they are also combined in a latent synthetic unity.  In synthesis, elements are combined, but the unity formed will be capable of analysis into the original components.  “Analysis is the synthesis of the whole which it divides, and synthesis the analysis of the whole which it constructs.” (471)  The crucial idea is the idea of the whole that analysis disassembles and synthesis constructs.  In analysis we operate on an explicit whole that falls into the background.  In synthesis we bring out the invisible totality comprehending the elements combined.

15. Inference and Judgment

With this wider conception of inference, it is getting harder to separate inference and judgment.  Certainly, synthetic judgments of sense involve a substantial inferential component, but even a judgment that comes straight from presentation seems to involve the analysis and synthesis that is characteristic of inference.  Judgment involves abstraction from the sensuous felt mass, and hence analysis.  Judgments assigning various characters to reality involve synthesis.  Bradley is certainly anxious to retain the distinction between judgment and inference.  “Inference is an experiment performed on a datum,” whereas in judgments of perception “there is properly no datum.” (479)  They do, indeed, have a basis, but this basis is for the intellect nothing.  “It is a sensuous whole which is merely felt and is not idealized.” (479)  Judgment is required to provide the ideal content from which inference takes its start.  In judgments of perception we have no rational ground to justify our result and “the stuff, upon which the act is directed, is not intellectual.” (480)  We can now, perhaps, make this clearer by explaining that the stuff in question is designated by a mass term.

The distinction between judgment and inference may not, however, be as sharp as one might like, as becomes clear when Bradley discusses the beginnings of our intellectual life.  “The earliest judgment will imply an operation which, although it is not inference, is something like it; and the earliest reasoning will begin with a datum, which though kin to judgment, is not intellectual.” (481) “Experience starts with a stimulation coming in from the periphery [what John McDowell calls ‘a brute impact from the exterior’]; but….the stimulation must be met by a central response.” (481) Sensations do not “simply walk into the mind.”  They are “the product of an active mental reaction.” (482)  The senses may give us sensations, but “the gift contains traces of something like thought.” (482)   The interface between cognition and the sensory input is murky indeed, but two things are clear. The response to the stimulus is not entirely arbitrary, nor is it a simple re-enactment of a given.  Nothing is given until it is received!

16. Formal Logic

Bradley is hostile to the idea of a purely formal logic whose goal is to construct a system of valid patterns of inference, covering all cases through the use of blanks and variables.  Partly, he does not believe that the goal can be achieved.  More basically, his concern is that the attempt to reconstruct inference in terms of the manipulation of counters in accordance with rules breaks the connection between inference and that continued reference to reality that lies at its heart.

Inferences do, indeed, proceed in accordance with principles, and we can reject a principle employed by finding another similar inference in which the premiss is true and the conclusion false.  In a particular inference, we can distinguish the principle from the matter involved, but we should not separate it and turn it into a major premiss in order to exhibit the argument as a syllogism.  The principle is not a premiss, because it is not a datum but a function. There may sometimes be a point in replacing the original argument with such a syllogism, but this option will not always be available. Every inference depends on a principle that is not a premiss, as Lewis Carroll has shown in “What the Tortoise Said to Achilles.” Even Principia Mathematica relies on its rules of substitution and detachment, which are not axioms of the system!
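
Carroll’s regress can be put schematically (this is a reconstruction of the moral usually drawn from the paper, not a quotation):

    (1) P
    (2) If P, then Q
    (3) If P and (if P, then Q), then Q

If the principle of detachment is written down as premiss (3), we still need a rule licensing the step from (1)-(3) to Q, and writing that rule down as a further premiss (4) only restarts the regress.  The principle by which an inference proceeds must therefore operate as a rule and cannot be absorbed into the premisses.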

17. Truth and Validity

So far the focus has been on the phenomenology of inference.  But inference is important, not because it takes place, but because it is taken to have validity and justification.  The problem is to explain how inference can have validity and justification in the face of the fundamental dilemma that Bradley identifies.  Unless there is a transition from the premiss to a different conclusion, nothing has happened, and there is no inference; but if there is a difference between premiss and conclusion, how can we justify the intellectual move?  Bradley dismisses the extreme claim that since they are different, there is an actual contradiction between premiss and conclusion.  To assert the premisses is not to deny the conclusion: it is merely to fail to assert it until the inference is completed.  But how is the eventual assertion of a different conclusion to be justified?

Logicians who do not challenge the legitimacy of the analytic judgment of sense can form a concept of truth that will allow them to explain that what is crucial for a valid inference is not that there be no change from premiss to conclusion, but  merely that there be no change in the truth value from true to false.  In the case of valid deductive inference this is guaranteed, because we merely re-arrange our information to make a certain element more salient.  What changes is merely our knowledge of the relation implicit in the premisses.  The act of inference requires an intervention by the subject that is arbitrary in the sense that it might not have taken place; but in the case of valid deductive inference, it is not an intervention that tampers with the truth.  There is, perhaps, more interference by the subject when a decision is made to eliminate part of the original ideal content, as when we drop the middle term in the conclusion of a syllogism.  Dropping ideal content even makes it possible that the conclusion is true, when the premisses contain error; but this does not matter, so long as it remains the case that if the premisses are true, the conclusion must also be true.

Perhaps deductive inference can be handled, if we do not probe too deeply, but Bradley now comes to a “rising sea” of non-deductive inferences that are not so easily controlled.  In mathematical construction we may infer the extension of a given straight line to double its size, but this is not the deduction of a conclusion from a premiss.   Comparison and distinction are also acts of the mind that are not deductive inference.  It could be argued, indeed, that these acts are not in fact inferences at all, but rather forms of plural judgment, originally involving more than one object distinguished within immediate experience.  Bradley, however, would not be greatly interested in this, since in his final view the distinction between judgment and inference is to be broken down.

The really serious problem, however, is empirical inference, including the prediction of the future on which we rely so heavily to carry out our purposes.  Bradley took the first step at the beginning of The Principles of Logic when he introduced the loosening from the given experience of an ideal content that can be transferred elsewhere.  This may explain how it is possible to formulate a belief about what will happen, but it does not explain why we choose to adopt the beliefs we do, or how these beliefs are to be justified.  Suppose we abstract from immediate experience a conjunction of ideal elements.  This may tempt us to imagine a similar conjunction in our representation of the future, but this would be justified, only if the connection of the elements were unconditioned and necessary.  Since in abstracting the conjunction from the given experience it has been separated from the context in which it was found, it remains, as Bradley believes, conditioned by that context.   Since this context is never completely known, the successful transfer of an ideal complex abstracted from the given context to a fresh context that may well be different cannot be guaranteed.

The recognition of the context in which the given ideal content is embedded undermines its guaranteed transfer elsewhere.  Does it also undermine the analytic judgment of sense that predicates the content of immediate experience?  This is what we are led to think in the move to the higher point of view, and it would be extremely serious, since it would destroy the very concept of true judgment.  It is ironic that at the beginning of The Principles of Logic Bradley uncovers the source of true judgment in the predication of an ideal content of an immediate experience from which it has been loosened and with which it is necessarily connected.  This explains how it is possible to transfer an ideal content extracted from immediate experience to a segment of reality not immediately experienced.  Such judgments, of course, may be either true or false.

This system is available as a lower point of view for those who are unable to follow Bradley all the way.  (It is also there as a fallback position, in the event that a fatal flaw is discovered in Bradley’s advanced reasoning, although Bradley himself does not seem to fear this possibility.)  The lower point of view is happy enough with the argument that empirical inferences have no logical guarantee, since the given object involved in the premiss is embedded in a context, ultimately unknown.  This argument establishes a conclusion to which everyone would agree.  What cannot be accepted is the use of the same fact to break the tie between ideal content and object that constitutes true judgment.  Without a viable concept of true judgment, even inference as we normally understand it will disappear, since the premisses and conclusion of an inference are all judgments, and a deductive argument is valid, if the conclusion must be true when the premisses are all true.

18. The Final Doctrine

a. Inference

We have been following the argument in the first edition of The Principles of Logic, in which Bradley tries to keep out the influence of his own metaphysical ideas, when operating at the lower level.   This is fortunate, because it makes Bradley’s often insightful discussion available to logicians who would be appalled by his metaphysics.  Bradley, as we know, is not ultimately satisfied with the lower point of view and feels compelled to move to a different position, where the influence of his metaphysical views can be detected. This difficult theory was not well understood, so that in the second edition of The Principles of Logic he included a set of terminal essays, which he hoped would provide a clearer exposition of his final views.

The original book began with judgment; the terminal essays begin with inference, which he now moves to the center.  “Every inference is the ideal self-development of a given object taken as real.” (598)  This definition attempts to explicate inference without using the notion of judgment, which will later be explained as a kind of inference.  Even the third member of the logical trinity, the universal idea, is partly concealed under cover as “the given object.”  The given object must be ideal, since this is the only kind of entity capable of ideal self-development.  Bradley’s definition of inference would have been much clearer, if he had explained it as the ideal self-development of a logical idea taken as real.  The concept of ideal self-development, however, contains a problem, encountered before.  If there is no change, there is no inference; but if there is change, then “the inference is destroyed.” (599)   Bradley cannot take the usual line that the transition in inference from judgment to judgment is valid, so long as the preservation of truth is guaranteed.  This would be circular, since he intends to explain judgment in terms of inference. Bradley’s solution relies on the double nature of the datum, considered in itself and as part of a systematic whole. This is what is involved in the reference of the ideal content to reality.  This reference to reality, familiar from Bradley’s initial account of judgment, now turns out to mean “taken to be real, as being in one with Reality, the real Universe.” (598)  This is the point of “taken as real” in the original definition.  To take an ideal content as real is to identify it with Reality, in so far as it belongs to Reality.

We can now perhaps understand why Bradley replaces “logical idea” with “given object” in his initial definition.  A logical idea can only be a part of a system of logical ideas, a system of thought.  A given object, as normally understood and as understood within Bradley’s lower point of view, is a part of the real universe.  It is the act of judgment that connects the domain of thought with the real world.  It is judgment that predicates a logical idea of reality or of an object that belongs to reality.  Without judgment, the only possible movement of thought is a movement along a stream of ideas.  The only thing more real than a logical idea is a complete system of all ideas, and we have fallen into the clutches of Hegel!  To adopt the term “given object” to denote logical ideas makes it difficult to use the same term to introduce concrete individuals constituting the universe.

The movement of inference can be illustrated in the Dialectical Method, in which we expand a given content through recognition of its incompleteness.  The explicit premiss is “some distinguished content set before us.” (601) Implicit is “the entire Reality as an ideal systematic Whole.”  “Every member in this system…develops itself through a series of more and more inclusive totalities until it becomes and contains the entire system.” (601)  When I use this method, everything is necessary except where I begin and when I stop. For Bradley, however, such inferences are never fully satisfactory, since their ground is largely implicit and unknown.

Bradley goes on to consider in some detail other processes such as analysis, abstraction and comparison.  His discussion of arithmetic is of surprising interest, because the construction of the natural number series does seem to make sense of the notion of ideal self-development.  Each natural number develops itself through the successor function to introduce the number that follows it.  The number three is an ideal content, since it is a universal property shared by all triples, so that the transition to four must lie in the domain of ideality.
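
Put in modern terms (a gloss on Bradley’s point, not his own notation), the series is generated by repeated application of the successor operation to what has already been constructed:

    0,  S(0) = 1,  S(1) = 2,  S(2) = 3, …

Each number thus “develops itself” into its successor by a fixed ideal operation, which is what gives the arithmetical example its force.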

The representation of space and time is constituted through a similar process involving the ideal self-development of a given space or time.  Although these examples may illuminate the obscure notion of ideal self-development, they will not help to explain inference, if the construction of the successor of a natural number or the space and time that lies beyond what is given is not an inference.  Inference is usually considered a movement of thought from judgment to judgment, from premiss to conclusion.  This is not what happens when we extend a line or form a new number.

b. Judgment

Bradley, however, would not accept this, since he considers judgment itself to be a kind of inference in the wide sense.  It is a kind of inference in which the ground that compels the judgment is not made explicit.  Inference is present, even in the purest case of an analytic judgment of sense.  As we have seen, Bradley recasts the judgment “S is P” in the form: “Reality is such that S is P.”  The word “such” is the placeholder for the ground in reality that compels the conclusion “S is P.”  Since this condition is unspecified and not completely specifiable, the inferential structure is merely implicit.  This is a radical change, under the influence of Bosanquet, from Bradley’s original position, where judgment lies at the interface between the ideal and the actual, between the universal and particular, and is hence distinct from inference which is a movement within thought.

Bradley supports his change of heart by giving an example.  Suppose I immediately experience A to the right of B and therefore form the judgment that A is to the right of B.  There is, presumably, some sort of causal explanation for the relative position of these things.  My objection is that any such condition for the existence of a state of affairs is not a truth condition for the corresponding judgment.  It would be a truth condition only if it were incorporated in the judgment, which it is not.  Even if I am prepared to say that A is to the right of B because John put it there, I am not saying that A is to the right of B, if John put it there.  My statement is categorical, not conditional, and I will insist that A is to the right of B, even if it turns out that John is not responsible.

The objects A and B that are the special subjects of the plural judgment are necessarily selected from and connected with “our whole Universe.” (Presumably, this is our Universe, because it is connected with our immediate experience.)  In a singular judgment the special subject is this reality, which is “some special and emphasized feature in the total mass.” (629)  All such special subjects are conditioned by what lies beyond.  Even without invoking the law of causality, they are all conditioned by their setting in space and time.  Bradley argues that since the special subject of the judgment must be conditioned, even if its conditions are not known, the judgment itself cannot be unconditioned.  “The object therefore remains conditioned by that which is unknown, and only on and subject to this unknown condition is the judgment true.” (631)  This sentence explicitly identifies the existence conditions of the object with the truth conditions of the judgment.  If we refuse to make this jump, we can remain comfortably at Bradley’s “lower point of view” and ignore the obscure and baffling complexities of the esoteric theory.

c. The Fundamental Problem of Thought

Even if we insist on a sharper distinction between judgment and inference than Bradley would allow, there is a general idea of a movement of thought that covers both activities.  There may be some movements of thought we prefer to call judgments and others we call inferences, but Bradley’s purpose is to dig out what all acts of thought have in common.  He believes he can state the fundamental problem without a final distinction between judgment and inference.  Thinking is a process that reaches a result, and this implies the transcending of some initial state.  It is not enough, however, that there be a mere succession of states. The movement of thought requires justification.  The movement of thought must “satisfy the intellect.” In the case of inference, the satisfactory is called “valid”; in the case of judgment, the satisfactory is called “true.”  In both cases the problem of the satisfaction condition is essentially the same.  “Thought demands to go proprio motu…with a ground and reason…. Now to pass from A to B, if the ground remains external, is for thought to pass with no ground at all.”  (Appearance and Reality, Note A, 501)  We might suppose that in the case of deductive inference, there is an internal ground within the domain of ideas, although Bradley would not agree.  But there is clearly no such internal justification for the inferential move in the case of non-deductive or empirical inferences.  The success of empirical inferences or predictions depends on the way the world is or will be.  Our general level of success depends on our living in a reasonably well-ordered world in which we have developed reliable systems for the acquisition of information.

Since the ground that justifies the movement of thought is the nature of reality, this ground can never be brought within thought without the identity of thought and reality.  Nothing less than this will satisfy the intellect.  This is the essentially Hegelian move to identify thought and reality by turning reality into a system of thought.  Not that a finite center can ever reach an unconditioned completion of its thought.  We may try to get as close as we can, and the closer we get to a final completion, the more truth our thought contains.  As we expand our system of thought to make it more comprehensive, the truer it will become, so long as it remains harmonious and coherent.   Although the goal of Thought in Dialectic may be to complete the incomplete, Bradley believes that there is more to reality than even a completed system of thought could provide.  Bradley is not a Hegelian, because he denies that the completion of thought, even if it were possible, would be identical with the Absolute.  He rejects the replacement of reality by “some spectral woof of impalpable abstractions, or unearthly ballet of bloodless categories.” (591)  Although Bradley follows Kant in accepting the transcendental ideality of the series of phenomena, a position that provided a stepping stone for Hegel, Bradley refuses to accept this creation of the mind as the reality encountered in immediate experience.  For Bradley, “it is the whole continuity of the total series which is absolutely based on ideal reconstruction.  By means of this function, and this function alone, we have connected the past in one line with the present.” (587)

d. Immediate Experience and the Absolute

Immediate experience is associated with a cluster of ideas: “this”, “my”, “now”, “here”.  What is immediately experienced is felt.  “Feeling may be either used of the whole mass felt at any one time, or it may again be applied to some element in that whole.” (659)  What I immediately experience is real enough, but this does not mean that everything real must be experienced by me.  As less than reality as a whole, Bradley calls my immediate experience an appearance of reality.  To Bradley, “it seems clear that we not only start from the given ‘this,’ but remain resting on that foundation throughout.  Our whole ordered universe we may call a construction resting on immediate experience.” (661)

Bradley clearly retains the phenomenal realism at the heart of traditional empiricism, while rejecting the idea that immediate experience is a collection of distinct existences, which was responsible for its demise.  Experience, for Bradley, is originally a sensuous, felt mass.  This is particularly acceptable with the re-instatement of mass terms, excluded by the logic of Principia Mathematica.

For Bradley, a collection of distinct existences is not given, but emerges through an analysis carried out by thought.  “I have to turn my experience into a disjunctive totality of elements.” (665) This is uncannily like Quine’s idea that “we persist in breaking reality down somehow into a multiplicity of identifiable and discriminable objects.”  The connection is particularly striking, once we realize that special subjects, as well as Reality as a Whole, may extend beyond what is presented in immediate experience.  The ideal contents, necessary to separate objects within the sensuous felt mass, do not confine these objects to their presentation in immediate experience.  Because the contents are universal, they permit what Hume would call the continued existence of such real things beyond their appearance in my mind.

Bradley’s theory must be taken very seriously because of the detailed account that it offers of a process that Quine leaves shrouded in mystery.  It may be understood as a way of fixing what is wrong with empiricism.  It is harder to sympathize with the arguments that led Bradley to abandon what he calls the “lower point of view” and which may be based on a mistake.

19. References and Further Reading

a. Selected works by F. H. Bradley

  • The Principles of Logic. Oxford University Press, 1883; second revised edition including terminal essays, 1922.
    • (This is the main source for Bradley’s logical theory.)
  • Appearance and Reality. Oxford University Press, 1893; second edition with appendix, 1897.
    • (The metaphysical theory.)
  • Essays on Truth and Reality. Oxford University Press, 1914.
    • (A collection of articles, for the most part originally published in Mind, and many on broadly logical topics.)
  • Collected Works. Thoemmes Press: Bristol, England and Sterling, Va., 1999.
    • (Volume I contains Bradley’s notes for The Principles of Logic.)

b. Further Reading

  • Allard, J. W., 2005, The Logical Foundations of Bradley’s Metaphysics: Judgment, Inference, and Truth. Cambridge University Press.
  • Basile, Pierfrancesco, 1999, Experience and Relations: An Examination of F. H. Bradley’s Conception of Reality.  Chapter 4.
  • Blanshard, Brand, 1939, The Nature of Thought.  Two Volumes.  London: George Allen & Unwin.
    • (Especially, Chapter XIII:  Bradley on Ideas in Logic and in Psychology.)
  • Bosanquet, Bernard, 1885, Knowledge and Reality, A Criticism of Mr. F. H. Bradley’s ‘Principles of Logic’.  London: Kegan Paul, Trench.
  • Bradley, James (ed.), 1996, Philosophy after F. H. Bradley. Bristol: Thoemmes.
  • Bradley Studies, the journal of the Bradley Society, was published from 1995 to 2004.
    • (It has now been succeeded by Collingwood and British Idealist Studies.)
  • Campbell, C. A., 1931, Scepticism and Construction: Bradley’s Sceptical Principle as the Basis of Constructive Philosophy. London: George Allen & Unwin.
  • Campbell, C. A., 1957, On Selfhood and Godhood. London: George Allen & Unwin.
    • (Gifford Lectures delivered at the University of St. Andrews.)
  • Campbell, C. A., 1967, In Defence of Free Will. London: George Allen & Unwin.
    • (Chapter XII.  The Mind‘s Involvement in Objects.  This was originally published in 1962 as a contribution to Theories of the Mind, edited by Jordan M. Scher, published by the Free Press of Glencoe, a division of the Macmillan Company.)
  • Candlish, S., 2007, The Russell/Bradley Dispute and its Significance for Twentieth-Century Philosophy. Basingstoke: Palgrave Macmillan.
  • Ferreira, P., 1999, Bradley and the Structure of Knowledge. Albany: SUNY Press.
  • Ferreira, P., 2014, ‘Idealist Logic’ in The Oxford Handbook of British Philosophy in the Nineteenth Century, Oxford: Oxford University Press, pp. 111-132.
  • Hylton, Peter, 1990, Russell, Idealism, and the Emergence of Analytic Philosophy.  Oxford University Press.  Chapter 2.
  • Levine, James, 1998, “The What and the That: Theories of Singular Thought in Bradley, Russell and the Early Wittgenstein” in Appearance Versus Reality: New Essays on Bradley’s Metaphysics. Oxford: Clarendon Press.
  • Mander, W. J. (ed.), 1996, Perspectives on the Logic and Metaphysics of F. H. Bradley. Bristol: St. Augustine’s Press.
  • Mander, W.J., 2008, ‘Bradley’s Logic’ in D. Gabbay and J.H. Woods (eds.) Handbook of the History of Logic.  Volume Four: British Logic in the Nineteenth Century, Elsevier, pp. 663-717.
  • Mander, W., 2011, British Idealism. A History.  Oxford University Press.
  • Manser, A., 1983, Bradley’s Logic.  Oxford University Press.
  • Peacocke, C., 1992, A Study of Concepts. Cambridge, MA and London: MIT Press. Chapter 3.
    • (This entry requires explanation, since Bradley is never mentioned in the book.  Chapter 3 introduces scenarios, which are non-conceptual representational contents.  As general, they qualify as ideal contents in Bradley’s sense.  The positioning of scenarios in reality is therefore a special case of an act of judgment that refers an ideal content to a reality beyond the act.  Peacocke is thus presenting the essence of Bradley’s position in an up-to-date form.)
  • Sprigge, T.L.S., 1993, James and Bradley.  Chicago and La Salle, Illinois: Open Court.  Part II.  Chapters 2 and 3.
  • Wollheim, R., 1959, F. H. Bradley. Harmondsworth: Penguin Books.

 

Author Information

D. L. C. Maclachlan
Email: lorne.maclachlan@gmail.com
Queen’s University
Canada

Natural Deduction

Natural Deduction (ND) is a common name for the class of proof systems composed of simple and self-evident inference rules based upon methods of proof and traditional ways of reasoning that have been applied since antiquity in deductive practice. The first formal ND systems were independently constructed in the 1930s by G. Gentzen and S. Jaśkowski and proposed as an alternative to Hilbert-style axiomatic systems. Gentzen introduced a format of ND particularly useful for theoretical investigations of the structure of proofs. Jaśkowski instead provided a format of ND more suitable for practical purposes of proof search. Since then, many other ND systems of apparently quite different character have been developed.

What is it that makes them all ND systems despite the differences in the selection of rules, construction of proof, and other features? First of all, in contrast to proofs in axiomatic systems, proofs in ND systems are based on the use of assumptions which are freely introduced but discharged under some conditions. Moreover, ND systems use many inference rules of simple character which show how to compose and decompose formulas in proofs. Finally, ND systems allow for the application of different proof-search strategies. Thanks to these features, proofs in ND systems tend to be much shorter and easier to construct than in axiomatic or tableau systems. These properties make ND systems one of the most popular tools for teaching logic in elementary courses. In addition to its educational value, ND is also an important tool in proof-theoretical investigations and in the philosophy of meaning (specifically, of the meaning of logical constants). This article focuses on the description of the main types of ND systems and briefly mentions more advanced issues concerning normal proofs and proof-theoretical semantics.

Table of Contents

  1. History of Natural Deduction
    1. Origins
    2. Prehistory
  2. Applications
  3. Demarcation Problem
    1. Wide and Narrow Sense of ND
    2. Criteria of Genuine ND
  4. Rules
  5. Proof Format
    1. Tree Proofs
    2. Linear Proofs
  6. Other Approaches
  7. Rules for Quantifiers
  8. ND for Non-Classical Logics
  9. Normal Proofs
  10. Philosophy of Meaning
  11. References and Further Reading

1. History of Natural Deduction

When dealing with the history of ND, one should distinguish between the exact date when the first formal systems of ND were presented and the much earlier times when the rules of ND were actually applied. Although one may claim that ND techniques have been used for as long as people have reasoned, it is unquestionable that the exact formulation of ND and the justification of its correctness had to wait until the 20th century.

a. Origins

The first ND systems were developed independently by Gerhard Gentzen and Stanisław Jaśkowski and presented in papers published in 1934 (Gentzen 1934, Jaśkowski 1934). Both approaches, although different in many respects, realized the same basic idea: a formally correct systematization of the traditional means of proving theorems in mathematics, science and ordinary discourse. It was a reaction to the artificiality of the formalization of proofs in axiomatic systems. Hilbert’s proof theory offered high standards of precision in the formulation of the notion of proof, but formal axiomatic proofs were quite different from the ‘real’ proofs offered by mathematicians. The process of actual deduction in axiomatic systems is usually complicated and needs a lot of invention. Moreover, formal axiomatic proofs are usually lengthy, hard to decipher and far removed from the informal arguments provided by mathematicians. In informal proofs, techniques such as conditional proof, indirect proof or proof by cases are commonly used; all are based on the introduction of arbitrary, temporarily accepted assumptions. Hence the goals of Gentzen and Jaśkowski were twofold: (1) to provide a theoretically and formally correct justification of traditional proof methods, and (2) to provide a system which supports actual proof search. Moreover, Gentzen’s approach supplied a programme for proof analysis which strongly influenced modern proof theory and philosophical research on theories of meaning.
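
A small example in a linear, Jaśkowski-style layout (the layout and rule labels here are purely illustrative) shows the technique of conditional proof that both systems were designed to legitimize; it derives the theorem p → (q → p) from no premisses at all:

    1. | p                    assumption
    2. | | q                  assumption
    3. | | p                  1, repetition
    4. | q → p                2-3, conditional proof (discharging q)
    5. p → (q → p)            1-4, conditional proof (discharging p)

The vertical strokes mark the scope of the temporarily accepted assumptions; once an assumption is discharged, the conclusion no longer depends on it. It is precisely this device that Hilbert-style axiomatic systems lack.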

b. Prehistory

According to some authors the roots of ND may be traced back to Ancient Greece. Corcoran (1972) proposed an interpretation of Aristotle’s syllogistic in terms of inference rules and proofs from assumptions. One can also look for the genesis of ND systems in Stoic logic, where many researchers (for example, Mates 1953) identify a practical application of the Deduction Theorem (DT). But all these examples, even if we accept the arguments of the historians of logic, show only the use of certain proof techniques. There is no evidence of theoretical interest in their justification.

In fact the introduction of DT into the realm of modern logic seems to be one of the most important steps on the way leading eventually to the discovery of ND. Although Herbrand did not present a formal proof of it for axiomatic systems until Herbrand (1930), he had already stated it in Herbrand (1928). At the same time Tarski (1930) included DT as one of the axioms of his Consequence Theory; in practice he had used it since 1921. Other ND-like rules were also applied in practice in the 1920s by many logicians of the Lvov-Warsaw School, such as Leśniewski and Salamucha, as is evident from their papers.
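
In its now-standard formulation for a consequence relation ⊢, the theorem runs as follows (a schematic statement, not a quotation from Herbrand or Tarski):

    If Γ, A ⊢ B, then Γ ⊢ A → B.

The converse direction is simply the rule of detachment (modus ponens). DT thus justifies, at the level of metatheory, the practice of proving a conditional by assuming its antecedent; ND systems build this practice directly into their rules of inference.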

Jaśkowski was strongly influenced by Łukasiewicz, who posed at his Warsaw seminar in 1926 the following problem: how to describe, in a formally proper way, the proof methods applied in practice by mathematicians. In response to this challenge Jaśkowski presented his first formulation of ND in 1927, at the First Polish Mathematical Congress in Lvov, as mentioned in the Proceedings (Jaśkowski 1929). A final solution was delayed until Jaśkowski (1934) because Jaśkowski had a lengthy break in his research due to illness and family problems. Gentzen also published the first part of his famous paper in 1934, but his first results are present in Gentzen (1932). This early paper, however, is concerned not with ND but with the first form of the Sequent Calculus (SC). Gentzen was influenced by Hertz (1929), where a tree-format notation for proofs, as well as the notion of a sequent, was introduced. One can also look for a source of the shape of his rules in Heyting’s axiomatization of intuitionistic logic (see von Plato 2014).

It should be no surprise that the two logicians, with no knowledge of each other’s work, independently proposed quite different solutions to the same problem. Axiom systems, although theoretically satisfying, were considered by many researchers to be practically inadequate and artificial. Thus the need for more practice-oriented deduction systems was in the air.

2. Applications

This article distinguishes at least three main fields of application of ND systems: practical, theoretical and philosophical.

Since 1934, many systems called ND have been offered by numerous authors in textbooks on elementary logic. In this way ND systems became a standard tool of working logicians, mathematicians, and philosophers. At least in the Anglo-American tradition, ND systems prevail in the teaching of logic. They also had a strong influence on the development of other types of non-axiomatic formal systems, such as sequent calculi and tableau systems. In fact, the former were also invented by Gentzen, as a theoretical tool for investigating the properties of ND proofs, whereas the latter may be seen (at least in the case of classical logic) as a further simplification of sequent calculus that is easier for practical applications.

But the importance of ND is not only of a practical character. Since the 1960s, the works of Prawitz (1965) and Raggio (1965) on normal proofs have opened up a theoretical perspective on the applications of ND. In fact, Prawitz was rediscovering things known to Gentzen but not published by him, as was later shown by von Plato (2008). In addition to extended work on the normalization of proofs, ND is also an interesting tool for investigations in theoretical computer science through the Curry-Howard isomorphism. This approach shows that (normal) ND proofs may be interpreted in terms of programs and their execution.
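
To give a flavour of this reading, here is a minimal sketch in Lean 4, a proof assistant used purely for illustration: one and the same lambda term can be read both as a proof of a simple propositional thesis and as a program that swaps the components of a pair of pieces of evidence. The names And.intro, And.left and And.right are Lean’s library names, not part of any ND system discussed in this article.

    -- A single lambda term read two ways (Curry-Howard):
    -- as a proof of p ∧ q → q ∧ p, and as a program that swaps a pair.
    example (p q : Prop) : p ∧ q → q ∧ p :=
      fun h => And.intro h.right h.left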

Finally, the special form of the ND rules provided by Gentzen led to extensive studies of the meaning of logical constants. This article takes a look at the theoretical and philosophical applications of ND in sections 9 and 10.

3. Demarcation Problem

The great richness of different forms of systems called ND leads to some theoretical problems concerning the precise meaning of the term ‘ND’. It seems that no generally accepted definition of ND systems has been offered. This demarcation problem has been investigated by many authors, and different criteria have been offered for establishing what is, and what is not, an ND system. A detailed survey of these matters may be found in Pelletier (1999) or in Pelletier and Hazen (2012); this article points out only the most important features.

a. Wide and Narrow Sense of ND

Some authors tend to use the term in a broad sense in which it covers almost everything that is not an axiomatic system in Hilbert’s sense. Hence systems like sequent calculi or tableau calculi are sometimes treated as ND systems. All of these systems are in fact closely related, but this article considers ND only in the narrow sense. There are at least three reasons for making this choice:

  • Historical. Gentzen introduced two systems: NK (natürlicher Kalkül) and LK (logistischer Kalkül). The former is an ND system, whereas the latter, a sequent calculus, was meant as a technical tool for proving metatheorems about NK, not as a kind of ND.
  • Etymological. ND is supposed to reconstruct, in a formally proper way, traditional ways of reasoning. It is disputable whether existing ND systems realize this task in a satisfying way, but systems like tableaux or SC certainly fare even worse in this respect.
  • Practical. Taking the term ND in a wide sense would yield a classification of doubtful usefulness. From the point of view of this article’s presentation, it is more convenient to use a more narrowly defined concept.

b. Criteria of Genuine ND

But what criteria should be used for delimiting the class of systems called ND? Many proposals seem too narrow (that is, too strict), since they exclude some systems usually treated as ND, so it is better not to be very demanding in this respect. On this approach, an ND system should satisfy three criteria:

  1. The possibility of entering and eliminating (discharging) additional assumptions during the course of the proof. Usually this requires some bookkeeping devices for indicating the scope of an assumption, that is, for showing that a part of the proof (a subproof) depends on a temporary assumption, and for marking the end of such a subproof, the point at which the assumption is “discharged”.
  2. The characterization of logical constants by means of rules rather than axioms. The role of axioms is taken over by the set of primitive rules for the introduction and elimination of logical constants, which means that elementary inferences, instead of formulas, are taken as primitive.
  3. The richness of forms of proof construction. Genuine ND systems admit a lot of freedom in proof construction and allow several strategies of proof search to be applied.

These three conditions seem to be the essential features of any ND system. They are quite general, but the third at least serves to exclude tableau systems and sequent calculi, since genuine ND should allow both direct and indirect proofs, proofs by cases, and so forth. This flexibility of proof construction is vital for ND, whereas, for example, in a standard tableau system we have only indirect proofs and elimination rules. On the other hand, ND does not require that its rules strictly realise the schema of providing a pair of introduction and elimination rules for each constant, nor that axioms be disallowed.

4. Rules

ND systems consist of a set of (schemata of) simple rules characterising logical constants. For example, the connective of conjunction \wedge is characterised by means of the following rules:

    \[\begin{array}{ccc}(\wedge I)\ \ \dfrac{\varphi, \psi}{\varphi\wedge\psi}\quad&(\wedge E)\ \ \dfrac{\varphi\wedge\psi}{\varphi}\quad&(\wedge E)\ \ \dfrac{\varphi\wedge\psi}{\psi}\end{array}\]

where \varphi and \psi denote any formulas. The material above the horizontal line represents the premises, and that below it represents the conclusion of the inference. The letters I and E in the names of the rules come from “introduction” and “elimination” respectively, since the first allows the introduction of a conjunction into a proof, and the second allows for its elimination in favor of simpler formulas. Often the following horizontal notation is applied (instead of the vertical one, which is more space-consuming):

    \[\begin{array}{ll} (\wedge E) & \varphi \wedge \psi \vdash \varphi \text{ and } \varphi \wedge \psi \vdash \psi \\ (\wedge I) & \varphi , \psi \vdash \varphi \wedge \psi \end{array}\]

Here \vdash is used to point out that the relation of deducibility holds between the premises and the conclusion of a rule instance. In what follows, such expressions are called sequents. In fact, such deducibility statements do not in general uniquely characterise inference rules, but this does no harm here, so for simplicity’s sake they are used in what follows.
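
Purely as an illustration, the conjunction rules have exact counterparts as proof terms in a proof assistant; the following minimal Lean 4 sketch uses Lean’s library names And.intro, And.left and And.right to play the roles of (\wedge I) and the two forms of (\wedge E).

    -- (∧I): from proofs of φ and ψ, build a proof of φ ∧ ψ
    example (φ ψ : Prop) (h1 : φ) (h2 : ψ) : φ ∧ ψ := And.intro h1 h2
    -- (∧E): from a proof of φ ∧ ψ, recover each conjunct
    example (φ ψ : Prop) (h : φ ∧ ψ) : φ := h.left
    example (φ ψ : Prop) (h : φ ∧ ψ) : ψ := h.right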

One can easily check that the rules stated above adequately characterise the meaning of classical conjunction, which is true iff both conjuncts are true. Hence the syntactic relation of deducibility coincides with the semantic relation \models, that is, logical consequence (or entailment). Unfortunately, not all logical constants may be characterised by means of such simple rules. For example, implication \rightarrow, in addition to modus ponens (the detachment rule):

    \[(\rightarrow E)\varphi \rightarrow \psi , \varphi \vdash \psi\]

which is known from axiomatic systems, requires a more complex rule (\rightarrow I) of the shape:

    \[\begin{array}{cc}& [\varphi] \\ & \vdots \\ \Gamma\ & \psi \\ \hline & \varphi \rightarrow \psi\end{array}\]

or:

(\rightarrow I) If \Gamma , \varphi\vdash\psi then \Gamma\vdash\varphi\rightarrow\psi

 
where \Gamma and \varphi together form the collection of all previously introduced active assumptions which could have been used in the deduction of \psi. When inferring \varphi\rightarrow\psi, one is allowed to discharge assumptions of the form \varphi. The fact that after the deduction of \varphi\rightarrow\psi this assumption is discharged (no longer active) is indicated by using [ ] in the vertical notation, and by deletion from the set of assumptions in the horizontal notation. The latter notation better displays the character of the rule: one deduction is transformed into another. It also shows that the rule (\rightarrow I) corresponds to an important metatheorem, the Deduction Theorem, which has to be proved in axiomatic formalizations of logic. In what follows, all rules of the shape \Gamma\vdash\varphi will be called inference rules, since they allow for inferring a formula (the conclusion) from other formulas (the premises) present in the proof. Rules of the form:

If \Gamma_1 \vdash \varphi_1, \ldots, \Gamma_n \vdash \varphi_n, then \Gamma \vdash \varphi

 
will be called proof construction rules since they allow for constructing a proof on the basis of some proofs already completed. One characteristic feature of such rules is that they involve the process of entering new assumptions as well as conditions under which one can discharge these assumptions and close subordinated proofs (or subproofs) starting with these assumptions.
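
Again purely as an illustration, under the Curry-Howard correspondence the inference rule (\rightarrow E) corresponds to function application, and the proof construction rule (\rightarrow I) to lambda abstraction, which discharges the assumption by binding it. A minimal Lean 4 sketch:

    -- (→E): applying a proof of φ → ψ to a proof of φ yields a proof of ψ
    example (φ ψ : Prop) (h : φ → ψ) (hφ : φ) : ψ := h hφ
    -- (→I): the assumption hφ : φ is "discharged" by the lambda binder
    example (φ ψ : Prop) (hψ : ψ) : φ → ψ := fun _hφ => hψ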

The complete set of rules provided by Gentzen for IPL (Intuitionistic Propositional Logic) is the following:

    \[\begin{array}{ll} (\bot E) & \bot \vdash \varphi \\ (\neg E) & \varphi , \neg \varphi \vdash \bot \\ (\neg I) & \text{If } \Gamma , \varphi \vdash \bot \text{, then } \Gamma \vdash \neg \varphi \\ (\wedge I) & \varphi , \psi \vdash \varphi \wedge \psi \\ (\wedge E) & \varphi \wedge \psi \vdash \varphi \text{ and }\varphi \wedge \psi \vdash \psi \\ (\rightarrow E) & \varphi , \varphi \rightarrow \psi \vdash \psi \\ (\rightarrow I) & \text{If }\Gamma , \varphi \vdash \psi \text{, then }\Gamma \vdash \varphi \rightarrow \psi \\ (\vee I) & \varphi \vdash \varphi \vee \psi \text{ and }\psi \vdash \varphi \vee \psi \\ (\vee E) & \text{If }\Gamma , \varphi \vdash \chi \text{ and }\Delta , \psi \vdash \chi \text{, then }\Gamma , \Delta , \varphi \vee \psi \vdash \chi\end{array}\]

What is evident from this set of rules is Gentzen’s policy of characterising every constant by a pair of rules, one for introducing a formula with that constant into a proof, and the other for eliminating such a formula, that is, for inferring some simpler consequences from it, sometimes with the aid of other premises. More will be said about the philosophical consequences of this approach in section 10.
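
As a further illustration, the remaining intuitionistic rules are mirrored by standard constructions of a proof assistant; here is a minimal Lean 4 sketch of (\vee I), (\vee E) and (\bot E), using Lean’s names Or.inl, Or.inr, Or.elim and False.elim.

    -- (∨I): either disjunct suffices to introduce a disjunction
    example (φ ψ : Prop) (h : φ) : φ ∨ ψ := Or.inl h
    example (φ ψ : Prop) (h : ψ) : φ ∨ ψ := Or.inr h
    -- (∨E): proof by cases on a disjunction
    example (φ ψ χ : Prop) (h : φ ∨ ψ) (h1 : φ → χ) (h2 : ψ → χ) : χ :=
      Or.elim h h1 h2
    -- (⊥E): anything follows from absurdity
    example (φ : Prop) (h : False) : φ := False.elim h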

In order to obtain CPL (Classical Propositional Logic), Gentzen added the Law of Excluded Middle \neg \varphi \vee \varphi as an axiom. The same result can easily be obtained by a suitable inference rule of double negation elimination, \neg \neg \varphi \vdash \varphi, or by changing one of the proof construction rules, namely by replacing (\neg I), which encodes the weak form of indirect proof, with the strong form:

(\neg E) If \Gamma , \neg \varphi \vdash \bot, then \Gamma \vdash \varphi

 
This solution was applied by Jaśkowski (1934).
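
For illustration, the classical strengthening can also be seen in a proof assistant: double negation elimination is not provable from the intuitionistic rules alone, but it follows from classical principles. A minimal Lean 4 sketch, relying on the Classical.byContradiction lemma of Lean’s library:

    -- Strong indirect proof: from the assumption ¬φ leading to absurdity,
    -- conclude φ (classical, not intuitionistic)
    example (φ : Prop) (h : ¬¬φ) : φ :=
      Classical.byContradiction (fun hn => h hn)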

5. Proof Format

In addition to providing suitable rules, one must also decide on the form of a proof. The two basic approaches, due to Gentzen and Jaśkowski respectively, are based on representing a proof as a tree and on representing it as a linear sequence of formulas. This article focuses on the most important differences between these two approaches. For a detailed comparison see Hazen and Pelletier (2014) and Restall (2014).

a. Tree Proofs

Let us start with an example of a proof in Gentzen’s format, that is, as a tree of formulas:

    \[\begin{array}{cl} \underline{[p]^1\hspace{.5cm} [p\rightarrow q]^3}\hspace{2cm} & ass. \\ \underline{q \hspace{2cm} [q \rightarrow r]^2} & (\rightarrow E) \\ \underline{\hspace{1cm}r\hspace{1cm}} & (\rightarrow E) \\ \underline{\hspace{1cm}p \rightarrow r^1\hspace{1cm}} & (\rightarrow I) \\ \underline{\hspace{.5cm}(q \rightarrow r)\rightarrow (p\rightarrow r)^2\hspace{.5cm}} & (\rightarrow I) \\ (p\rightarrow q)\rightarrow ((q \rightarrow r)\rightarrow (p\rightarrow r))^3 & (\rightarrow I) \end{array}\]

Here the root of the tree is labelled with a thesis and its leaves are labelled with the (discharged) assumptions: p\rightarrow q, q\rightarrow r and p. All the assumptions were discharged as (\rightarrow I) was applied successively, building implications starting from r; the numbers on the assumptions indicate the order in which they were discharged, and the same number is attached to the formula inferred by the respective application of the assumption-discharging rule. Before that, r was deduced by two applications of (\rightarrow E), first to two assumptions (active at that moment), then to the third assumption and the previously deduced q.
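
Purely as an illustration, under the Curry-Howard correspondence this whole tree collapses into a single proof term in which each lambda binder discharges one assumption, in the same order as the numerals above; a minimal Lean 4 sketch:

    -- The tree proof of (p → q) → ((q → r) → (p → r)) as one term;
    -- hpq, hqr and hp play the roles of assumptions 3, 2 and 1.
    example (p q r : Prop) : (p → q) → ((q → r) → (p → r)) :=
      fun hpq hqr hp => hqr (hpq hp)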

Gentzen’s tree format of representing proofs has many advantages. It is an excellent representation of real proofs; in particular, deductive dependencies between formulas are directly shown. But if we are concerned with actual deduction, this format of proof is far from being useful and natural. Moreover, one is often forced to repeat identical, or very similar, parts of the proof, since, in tree format, inferences are conducted not on formulas but on their particular occurrences. For example, if \varphi \wedge \psi is an assumption from which we need to infer both \varphi and \psi, then a suitable branch starting with \varphi \wedge \psi must be displayed twice. The following example illustrates the point:

    \[\begin{array}{cl} \hspace{.5cm}\underline{[p\wedge (q \wedge p \rightarrow r)]^2}\hspace{2.5cm} & ass.\\ \underline{[q]^1\hspace{1cm} p}\hspace{2cm}\underline{[p\wedge (q \wedge p \rightarrow r)]^3} & (\wedge E) \\ \underline{q \wedge p \hspace{3cm} q \wedge p \rightarrow r} & (\wedge I), (\wedge E) \\ \underline{\hspace{.5cm}r\hspace{.5cm}} & (\rightarrow E) \\ \underline{\hspace{.5cm}q \rightarrow r^1\hspace{.5cm}} & (\rightarrow I) \\ (p\wedge (q \wedge p \rightarrow r))\rightarrow (q\rightarrow r)^{2, 3} & (\rightarrow I)\end{array}\]

Here, the attachment of the two numerals 2 and 3 to the formula in the last line indicates that both occurrences of the same assumption were discharged in this step.
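
For illustration, the contrast is vivid under the Curry-Howard reading: in a term representation the duplicated branch simply becomes a single bound variable used twice. A minimal Lean 4 sketch of the thesis proved above:

    -- h is used twice, corresponding to the two displayed occurrences of
    -- the assumption p ∧ (q ∧ p → r) in the tree proof
    example (p q r : Prop) : (p ∧ (q ∧ p → r)) → (q → r) :=
      fun h hq => h.right (And.intro hq h.left)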

Gentzen himself was aware of the disadvantages of his representation of proof, but it proved useful for his theoretical interests described in section 9. It is not surprising that the tree format of proofs is mainly used in theoretical studies on ND, as in Prawitz (1965) or Negri and von Plato (2001).

b. Linear Proofs

Jaśkowski, on the other hand, preferred a linear representation of proofs since he was interested in creating a practical tool for deduction. Linear format has many virtues over Gentzen’s approach. For example, inferences are drawn from assumptions rather than from their occurrences, which means that, for example, one needs to assume \varphi \wedge \psi only once to derive both conjuncts. It is also more natural to construct a linear sequence trying, one by one, each possible application of the rules. But there is a price to be paid for these simplifications—the problem of subordinated proofs. How should we represent that some assumption and its subordinated proof are no longer alive because a suitable proof construction rule was applied? If we apply a proof construction rule which discharges an assumption, we must explicitly show that the subordinate proof dependent on this assumption is dead in the sense that no formula from it may be used below in the proof. In a tree format this is not a problem—to use a formula as a premise for the application of some inference rule we must display it (and the whole subtree which provides a justification for it) directly above the conclusion. In linear format this leads to problems, and some technical devices are necessary which forbid using the assumptions and other formulas inferred inside completed subproofs. Jaśkowski proposed two solutions to this problem: graphical (boxes) and bookkeeping (in the terminology of Pelletier and Hazen 2012). Let us compare these two simple proofs:

[Figure: the same proof displayed twice, in Jaśkowski’s graphical (box) format on the left and in his bookkeeping (prefix) format on the right.]

On the left we have an example of a proof in graphical mode, where each assumption opens a new box in which the rest of the proof is carried out. When a suitable proof construction rule is applied, the current subproof is closed off in its box, which means that nothing inside it may be used in further proof construction. In lines 3 and 5 an additional rule of repetition (often called reiteration) is applied, which allows formulas to be moved from outer to inner boxes. On the right the same proof is represented in bookkeeping style, where instead of boxes we use prefixes (sequences of natural numbers) for indicating the scope of an assumption. Each assumption is preceded by the letter S (from the Latin suppositio) and adds a new numeral to the sequence of natural numbers in the prefix. When a proof construction rule is applied, the last item is removed from the prefix. Hence a thesis can occur with an empty prefix, signifying that it does not depend on any assumption. No repetition rules are applied in this version of Jaśkowski’s system; hence the proof is two lines shorter.

Although Jaśkowski finally chose the second option (perhaps due to editorial problems), nowadays the graphical approach is far more popular, probably due to the great success of Fitch’s textbook (1952), which popularized a simplified version of Jaśkowski’s system (now called Fitch’s approach). In Fitch’s system vertical lines are used to indicate subproofs. Below is an example of a proof in Fitch’s format:

[Figure: an example proof in Fitch’s format, with vertical lines marking subproofs.]

Other devices were also applied, such as brackets in Copi (1954), or even just the indentation of subordinate proofs. Jaśkowski’s original boxes were used by Kalish and Montague (1964), with an additional device of great heuristic value: each box is preceded by a show-line which displays the current aim of the proof. Show-lines are not parts of the proof in the sense that one is forbidden to use them as premises for rule applications. But after a subproof is completed, its box is closed and the opening show-line becomes a new ordinary line of the proof (which is indicated by deleting the prefix “show”).

Jaśkowski’s second solution was not as popular. One can mention here Quine’s system (1950) (with asterisks instead of numerals) or Słupecki and Borkowski’s system (1958), popular in Poland.

6. Other Approaches

Gentzen (1936) introduced yet another variant of ND, which may be considered as lying between his first system, described in section 5.a, and his famous sequent calculus. It shows another possible way of arranging the bookkeeping of active assumptions. In this approach the basic items transformed in proofs are not formulas but rather sequents. For example, the two rules for conjunction are of the form:

    \[\begin{array}{ll} (\wedge I') & \text{If }\Gamma \vdash \varphi \text{ and } \Delta \vdash \psi\text{, then } \Gamma, \Delta \vdash \varphi \wedge \psi \\ (\wedge E') & \text{If } \Gamma \vdash \varphi \wedge \psi\text{, then } \Gamma \vdash \varphi \text{;} \ \ \text{If } \Gamma \vdash \varphi \wedge \psi\text{, then } \Gamma \vdash \psi \end{array}\]

where \Gamma , \Delta are records of active assumptions.

The full list of rules for CPL contains also:

    \[\begin{array}{ll} (\neg E') & \text{If } \Gamma \vdash \neg\neg\varphi\text{, then } \Gamma \vdash \varphi \\ (\neg I') & \text{If } \Gamma, \varphi \vdash \psi \text{ and } \Delta, \varphi \vdash \neg\psi\text{, then } \Gamma, \Delta \vdash \neg\varphi \\ (\rightarrow E') & \text{If } \Gamma \vdash \varphi \text{ and } \Delta \vdash \varphi \rightarrow \psi\text{, then } \Gamma, \Delta \vdash \psi \\ (\rightarrow I') & \text{If } \Gamma , \varphi \vdash \psi\text{, then } \Gamma \vdash \varphi \rightarrow \psi \\ (\vee I') & \text{If } \Gamma \vdash \varphi\text{, then } \Gamma \vdash \varphi \vee \psi\text{;} \ \ \text{If } \Gamma \vdash \psi\text{, then } \Gamma \vdash \varphi \vee \psi \\ (\vee E') & \text{If } \Gamma \vdash \varphi \vee \psi \text{ and } \Delta, \varphi \vdash \chi \text{ and } \Lambda , \psi \vdash \chi\text{, then } \Gamma , \Delta , \Lambda \vdash \chi \end{array}\]

Assumptions are sequents of the form \varphi \vdash \varphi. Theses are sequents with an empty antecedent. Here is an example of a proof:

    \[\begin{array}{c} \underline{p \vdash p\hspace{1cm} p\rightarrow q \vdash p\rightarrow q}\hspace{4cm} \\ \underline{p, p\rightarrow q\vdash q \hspace{3cm} q \rightarrow r\vdash q\rightarrow r} \\ \underline{p, p\rightarrow q, q\rightarrow r \vdash r} \\ \underline{p\rightarrow q, q\rightarrow r \vdash p \rightarrow r} \\ \underline{p\rightarrow q \vdash (q \rightarrow r)\rightarrow(p\rightarrow r)} \\ \vdash (p\rightarrow q)\rightarrow ((q\rightarrow r)\rightarrow(p\rightarrow r)) \end{array}\]

One can observe that in such a system the difference between inference rules and proof construction rules disappears. The only difference is that in the former all transformations are performed on the consequents of sequents, whereas in the latter some operations (namely, the removal of discharged assumptions) are also allowed on antecedents. This is what distinguishes the system from Gentzen’s ordinary sequent calculus, where there are rules introducing constants into the antecedents of sequents (instead of rules of elimination). Of course one can go further and allow this kind of rule as well (such a system was constructed, for example, by Hermes 1963), but it seems that Gentzen’s choice offers significant simplifications. First of all, the tree format is not necessary, and one can display proofs as linear sequences, since the record of active assumptions is kept with every formula in the proof (as the antecedent). Moreover, since no operation except deletion is carried out on antecedents, we can get rid of the formulas in antecedents and instead use the numbers of the lines where the relevant assumptions were introduced into the proof. Both simplifications are present in Suppes’ (1957) system of ND, where the same proof looks like this:

    \[\begin{array}{lcll} 1 & \{1\} & p\rightarrow q & \text{ass.} \\ 2 & \{2\} & q\rightarrow r & \text{ass.} \\ 3 & \{3\} & p & \text{ass.} \\ 4 & \{1, 3\} & q & 1, 3, \ (\rightarrow E) \\ 5 & \{1, 2, 3\} & r & 2, 4, \ (\rightarrow E) \\ 6 & \{1, 2\} & p\rightarrow r & 5, \ (\rightarrow I) \\ 7 & \{1\} & (q \rightarrow r)\rightarrow (p\rightarrow r) & 6, \ (\rightarrow I) \\ 8 & \varnothing & (p\rightarrow q)\rightarrow ((q \rightarrow r)\rightarrow (p\rightarrow r)) & 7, \ (\rightarrow I) \end{array}\]

Other solutions generalising standard proof representations were also considered. One can mention at least two approaches without going into details: ND operating on clauses instead of formulas (Borićić 1985, Cellucci 1992, Indrzejczak 2010)  and ND admitting subproofs as items in the proof (Fitch 1966, Schroeder-Heister 1984).

7. Rules for Quantifiers

Gentzen (1934) also provided the first set of ND rules adequate for CFOL (Classical First-Order Logic), whereas the rules of Jaśkowski’s system characterised the weaker system of IFOL (Inclusive First-Order Logic), which admits empty domains in models. As pointed out by Bencivenga (2014), a minimal relaxation of Jaśkowski’s rules also yields Free Logic, that is, a logic allowing non-denoting terms; hence it may be claimed that Jaśkowski’s system is the first formalization of Universally Free Logic, that is, a logic allowing both empty domains and non-denoting terms.

Before characterising Gentzen’s original rules for quantifiers, let us note that he used two sorts of symbols to distinguish between free and bound individual variables. The former are often called individual parameters. Such a solution simplifies the formulation of the rules and eliminates the risk of a clash of variables when the rules are applied. When we provide ND rules for more standard approaches, with just individual variables which may have free or bound occurrences, we must be careful to define precisely the operation of proper substitution of a term for all free occurrences of a variable. ‘Proper’ means that no occurrence of a free variable substituted for another (or, when function symbols are used, within a term substituted for a variable) gets bound by a quantifier. For simplicity’s sake we will keep Gentzen’s solution; let x, y, z denote (bound) variables and a, b, c free variables, or individual parameters. Gentzen’s rules are the following:

    \[\begin{array}{ll} (\forall E) & \forall x\varphi \vdash \varphi[x/a] \\ (\exists I) & \varphi [x/a] \vdash \exists x\varphi \\ (\forall I) & \text{If }\Gamma \vdash \varphi [x/a] \text{, then }\Gamma \vdash \forall x\varphi \\ (\exists E) & \text{If }\Gamma \vdash \exists x\varphi\text{ and } \Delta, \varphi[x/a] \vdash \psi\text{, then }\Gamma, \Delta \vdash \psi \end{array}\]

where \varphi [x/a] denotes the operation of substitution, that is, of replacing all free occurrences of x in \varphi with the parameter a. In the case of (\forall I) and (\exists E) the parameter a is required to be “fresh” in the sense of having no other occurrences in \Gamma , \Delta, \varphi, \psi. Such a fresh a is sometimes called an ‘eigenvariable’ or a ‘proper variable’.
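
Purely as an illustration, these four rules again have direct analogues in a proof assistant; in the following minimal Lean 4 sketch, Exists.elim plays the role of (\exists E), the application h1 a realises (\forall E), and Exists.intro realises (\exists I).

    -- (∀E), (∃E) and (∃I) in one derivation:
    -- from ∀x (P x → Q x) and ∃x P x, conclude ∃x Q x
    example (α : Type) (P Q : α → Prop)
        (h1 : ∀ x, P x → Q x) (h2 : ∃ x, P x) : ∃ x, Q x :=
      Exists.elim h2 (fun a ha => Exists.intro a (h1 a ha))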

The last rule in Gentzen’s tree format looks as follows:

    \[\begin{array}{crc} \Gamma & & [\varphi[x/a]], \Delta \\ \vdots & & \vdots \\ \exists x\varphi & & \psi \\ \hline & \psi & \end{array}\]

Although Gentzen provided this set of rules for his tree system of ND, they were easily adapted to linear systems based on Jaśkowski’s (or Suppes’) proof format. Let us illustrate their application in Fitch’s proof format (though not with his original rules):

[Figure: a first-order proof in Fitch’s format using the quantifier rules; its key steps are described below.]

The first application of (\forall E) introduces a parameter a in place of x. In lines 3 and 7 the assumptions for the applications of (\exists E) in lines 5 and 10, respectively, are introduced, each time with a new eigenparameter in place of y. Note that both applications of (\exists E) are correct, since neither b nor c occurs in the formulas ending the respective subproofs. The application of (\forall I) in line 6 is also correct, since a does not occur in line 1.

The fact that (\forall I) is a proof construction rule is obscured here, since there is no need to introduce a subproof by means of a new assumption. We simply require that, in order to apply (\forall I), there be no occurrence of the parameter involved (here a) in the active assumptions. However, there are systems of ND where such a subproof (usually flagged with a fresh parameter which will be universally quantified below) is explicitly introduced into the proof. For instance, Fitch’s original rule is based on such a solution; in fact it closely follows Jaśkowski’s original rule for the inclusive universal quantifier.

Gentzen’s (\exists E) was sometimes considered complex and artificial, and some inference rules were proposed instead in which \varphi[x/a] is directly inferred rather than assumed. Although the idea is simple, its correct implementation leads to trouble. Careful formulations of such a rule (as in Quine 1950) are correct but hard to follow; simple formulations (as in several editions of Copi 1954) make the system unsound. For a detailed analysis of the relations between Gentzen-style and Quine-style quantifier rules one should consult Fine (1985), Hazen (1987) and Pelletier (1999). All these problems with providing correct and simple rules for quantifiers led some authors to doubt whether it is really possible (see Anellis 1991). It seems that the only correct system of ND for CFOL with a ‘really’ simple rule of this kind is that of Kalish and Montague (1964), but this is rather a side effect of the overall architecture of their system, which is not discussed here (see the detailed explanation of the virtues of Kalish and Montague’s system in Indrzejczak 2010).

8. ND for Non-Classical Logics

ND systems have also been offered for many important non-classical logics. In particular, Jaśkowski’s graphical approach is very handy in this field, due to the machinery of isolated subproofs. It turned out that for many non-classical logics one can obtain a satisfying result by putting restrictions on the rule of repetition for certain subproofs. Let us take as an example the ND formalization of the well-known propositional modal logic T; for simplicity we restrict our considerations to the rules for \Box (necessity). (\Box E) is obvious: \Box \varphi \vdash \varphi. With (\Box I) the situation is more complicated, since it is based on the following principle:

If \varphi_1, ..., \varphi_n \vdash \psi, then \Box\varphi_1, ..., \Box\varphi_n \vdash \Box\psi

 
where the formulas in the antecedent are also changed by the addition of \Box. The rule is realised by means of a special ‘modal’ subproof, which is opened with no assumption; no formulas may be put into it except those which were preceded by \Box in outer subproofs (and with the \Box deleted after the transition). If in such a modal subproof we deduce \psi, it can be closed and \Box\psi can be put into the outer subproof. The following proof in Fitch’s style illustrates this:

[Figure: a Fitch-style proof in the modal logic T using a modal subproof; it is described below.]

In line 4 a modal subproof is initiated, which is shown by putting a sole \Box in place of an assumption. Lines 5 and 6 result from applications of modal repetition. Such an approach may easily be extended to other modal logics by modifying the conditions of modal repetition; for example, for S4 it is enough to admit that formulas with \Box may also be repeated (with no deletion of \Box); for S5, formulas with negated \Box are also allowed. This approach to modal logics was initiated by Fitch (1952); extensive studies of such systems can be found in Fitting (1983), Garson (2006) and Indrzejczak (2010), where some other approaches are also discussed.

This way of formalizing logics in ND was also applied to other non-classical logics, including conditional logics (Thomason 1970), temporal logics (Indrzejczak 1994) and relevant logics (Anderson and Belnap 1975). In the latter, however, the technique of restricted repetition is not enough (and is not even required for some logics of this kind). Far more important is the technique of labelling all formulas with sets of numerals recording the active assumptions, which is necessary for keeping track of relevance conditions. More generally, the application of labels of various kinds is in fact one of the most popular techniques used not only in tableau methods but also in ND. Vigano (2000) provides a good survey of this approach.

9. Normal Proofs

When constructing proofs one can easily make some inferences which are unnecessary for obtaining a goal. Gentzen was interested not only in providing an adequate system of ND but also in showing that everything which may be proved in such a system may be proved in the most straightforward way. As he put it, in such a proof “No concepts enter into the proof other than those contained in its final result, and their use was therefore essential to the achievement of the result’’ (Gentzen 1934).

In particular, such unnecessary moves are made if one first applies an introduction rule for a logical constant c and then uses the conclusion of this rule application as a premise for an application of the elimination rule for c. In such cases the final conclusion is either already present in the proof (as one of the premises of the respective introduction rule) or may be deduced directly from the premises of that application of the introduction rule. For example, if one deduces \varphi\rightarrow\psi by (\rightarrow I) and then by (\rightarrow E) deduces \psi from this implication and \varphi, then it is simpler to deduce \psi directly from \varphi; the existence of such a deduction is guaranteed, because it is exactly the subproof used to introduce \varphi\rightarrow\psi. Let us call a maximal formula any formula which is at the same time the conclusion of an introduction rule and the main premise of an elimination rule. A proof is called normal iff no maximal formula is present in it. Roughly speaking, we can obtain such a proof if we first apply elimination rules to our assumptions (premises) and then apply introduction rules to obtain the conclusion. Such proofs are analytic in the sense of having the subformula property: all formulas occurring in such a proof are subformulas, or negations of subformulas, of the conclusion or of the premises (undischarged assumptions).
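
For illustration, the Curry-Howard reading makes the reduction of such detours concrete: a maximal formula corresponds to a redex, and normalization to its reduction. A minimal Lean 4 sketch of a non-normal and a normal proof of the same claim:

    -- Non-normal: p ∧ q is introduced by (∧I) and immediately eliminated
    -- by (∧E); the pair And.intro hp hq is a maximal formula (a redex)
    example (p q : Prop) (hp : p) (hq : q) : p := (And.intro hp hq).left
    -- Normal: the detour reduces away
    example (p q : Prop) (hp : p) (hq : q) : p := hp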

Although the idea of a normal proof is rather simple to grasp, it is not so simple to show that everything provable in an ND system has a normal proof. In fact, for many ND systems (especially for many non-classical logics) such a result does not hold. Gentzen proved such a result directly for his ND system for Intuitionistic Logic, but he was unable to do the same for his ND system for Classical Logic. He did not publish the direct proof for the intuitionistic case, and instead provided the result for both of his ND systems indirectly. First he introduced an auxiliary technical system, the sequent calculus, and proved for it (in both the classical and the intuitionistic case) the famous Cut-Elimination Theorem. Then he showed that this result implies the existence of a normal proof for every thesis and valid argument provable in his ND systems. Such a result is usually called the Normal Form Theorem, whereas the stronger result, showing directly how to transform every ND proof into a normal proof by means of a systematic procedure, is called the Normalization Theorem. That Gentzen had indeed proved the Normalization Theorem for the intuitionistic case became known only recently, when von Plato (2008) found a preliminary draft of Gentzen’s thesis. The first published proofs of normalization theorems appeared in the 1960s, due to Raggio (1965) and Prawitz (1965), who proved the result also for ND systems for some non-classical logics. For a detailed account of these problems see Troelstra and Schwichtenberg (1996) or Negri and von Plato (2001).

One thing should be noted with respect to proofs in normal form. Although normal proofs are in a sense the most direct proofs, this does not mean that they are the most economical. In fact, non-normal proofs may often be shorter and easier to understand than normal ones. This is perhaps easier to see if we recall that normalization in ND is the counterpart of cut elimination in sequent calculi. Applications of cut in proofs correspond to applications of previously proved results as lemmas and may drastically shorten proofs. When a proof is normalized, its size may grow exponentially (see, for example, Boolos 1984, Fitting 1996, D’Agostino 1999). What is important about normal proofs is that, due to their conceptual simplicity, they provide a proof-theoretical justification of deduction and a new way of understanding the meaning of logical constants.

10. Philosophy of Meaning

Aesthetics was not the only reason for insisting on having both introduction and elimination rules for every constant in Gentzen’s ND. He also wanted to realise a deeper philosophical intuition concerning the meaning of logical constants. It is claimed that if a set of rules is intuitive and sufficient for an adequate characterisation of a constant, then it in fact expresses our way of understanding that constant. Moreover, such an approach may be connected with Wittgenstein’s program of characterizing meaning in terms of the use of words. In this particular case the meaning of the logical constants is characterised by their use (via rules) in proof construction. There is also a strong connection with the anti-realist position in the philosophy of meaning, where it is claimed that the notion of truth may be successfully replaced with the notion of proof (Dummett 1991). One recent, and very strong, version of this trend is represented by Brandom’s (2000) program of strong inferentialism, where it is postulated that the meanings of all expressions may be characterised by means of their use in broadly understood reasoning processes. However, inferentialism is not specifically tied to ND, nor to the particular shape of its rules as giving rise to the meaning of logical constants.

Leaving aside the far-reaching program of inferentialism, one can quite reasonably ask whether the characteristic rules of the logical constants may be treated as definitions. The term ‘Proof-Theoretic Semantics’ first appeared in 1991 (Schroeder-Heister 1991), but the roots of the idea certainly go back to Gentzen (1934). He himself preferred introduction rules as a kind of definition of a constant. Elimination rules are just consequences of these ‘definitions’, not in the sense of being deducible from them but in the sense that their application is a kind of inversion of the introduction rules. The notion of inversion was precisely characterised by Prawitz’s inversion principle [see Prawitz (1965)]: if by an application of an elimination rule r we obtain \varphi, then the proofs sufficient for deducing the premises of r already contain a deduction of \varphi. Hence one can obtain \varphi directly on the basis of these proofs, with no application of r. As these sufficient conditions for the deduction of the premises are characterised by the introduction rules, we can easily see that the inversion principle is strongly connected with the possibility of proving normalization theorems; it justifies the reduction steps for maximal formulas in normalization procedures.

Not all authors dealing with proof-theoretic semantics have followed Gentzen in his particular solutions. Popper (1947) was the first to try to construct deductive systems in which all the rules for a constant are treated together as its definition. There are also approaches (such as Dummett 1991, chapter 13, and Prawitz 1971) in which elimination rules are treated as the most fundamental. No matter which kind of rule is taken as basic for the characterization of logical constants, it is obvious that not just any set of rules may be treated as a candidate definition. Prior (1960) drew attention to this fact by means of his famous example. Let us consider a connective “tonk” characterised by the following rules:

(tonk I) \varphi \vdash \varphi tonk \psi
(tonk E) \varphi tonk \psi \vdash \psi

 
One can easily show that, after adding such rules to an ND system, any formula is deducible from any formula. However, Prior’s example only showed that one should carefully characterise the conditions of correctness for rules which are proposed as a tool for the characterisation of logical constants. One of the first proposals is due to Belnap (1962), who emphasized that, just as for definitions, rules must be noncreative in the sense that if we add them to some ND system, then we obtain a conservative extension of it. In other words, if some formula with no occurrence of the new constant was not deducible in the ‘old’ system, then it is still not deducible in the extended system. The rules for “tonk” do not satisfy this requirement. Although Belnap’s solution is not sufficient, he opened the door to further research on such conditions. The term “(proof-theoretic) harmony” is widely used for the specification of such adequacy conditions for rules, and there is a large literature concerned with this question. Schroeder-Heister (2014) provides one of the recent solutions to this problem, whereas Schroeder-Heister (2012) offers an extensive discussion of other approaches.
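
Purely as an illustration, the effect of Prior’s rules can be simulated in a proof assistant by postulating a hypothetical connective governed by exactly these two rules; in the following Lean 4 sketch the names Tonk, tonkI and tonkE are invented for the example and belong to no standard library.

    -- Hypothetical connective with Prior's rules, postulated as axioms
    axiom Tonk : Prop → Prop → Prop
    axiom tonkI : ∀ {φ ψ : Prop}, φ → Tonk φ ψ    -- (tonk I)
    axiom tonkE : ∀ {φ ψ : Prop}, Tonk φ ψ → ψ    -- (tonk E)

    -- With these rules, any formula follows from any formula
    example (φ ψ : Prop) (h : φ) : ψ := tonkE (tonkI h)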

11. References and Further Reading

  • [1] Anderson, A. R. and N. D. Belnap, Entailment: The Logic of Relevance and Necessity, vol. I, Princeton University Press, Princeton 1975.
  • [2] Anellis, I. H., ‘Forty Years of “Unnatural” Natural Deduction and Quantification. A History of First-Order Systems of Natural Deduction from Gentzen to Copi’, Modern Logic, 2(2): 113-152, 1991.
  • [3] Belnap, N. D., ‘Tonk, Plonk and Plink’, Analysis, 22(6): 130-134, 1962.
  • [4] Bencivenga, E., ‘Jaśkowski’s Universally Free Logic’, Studia Logica, 102(6): 1095-1102, 2014.
  • [5] Boolos, G., ‘Don’t Eliminate Cut’, Journal of Philosophical Logic, 13: 373-378, 1984.
  • [6] Borićić, B. R., ‘On Sequence-conclusion Natural Deduction Systems’, Journal of Philosophical Logic, 14: 359-377, 1985.
  • [7] Borkowski, L. and J. Słupecki, ‘A Logical System Based on Rules and Its Applications in Teaching Mathematical Logic’, Studia Logica, 7: 71-113, 1958.
  • [8] Brandom, R., Articulating Reasons: An Introduction to Inferentialism, Harvard University Press, Cambridge 2000.
  • [9] Cellucci, C., ‘Existential Instantiation and Normalization in Sequent Natural Deduction’, Annals of Pure and Applied Logic, 58: 111-148, 1992.
  • [10] Copi, I. M., Symbolic Logic, The Macmillan Company, New York 1954.
  • [11] Corcoran, J., ‘Aristotle’s Natural Deduction System’, in: J. Corcoran (ed.), Ancient Logic and Its Modern Interpretations, Reidel, Dordrecht 1972.
  • [12] D’Agostino, M., ‘Tableau Methods for Classical Propositional Logic’, in: M. D’Agostino et al. (eds.), Handbook of Tableau Methods, pp. 45-123, Kluwer Academic Publishers, Dordrecht 1999.
  • [13] Dummett, M., The Logical Basis of Metaphysics, Harvard University Press, Cambridge 1991.
  • [14] Fine, K., ‘Natural Deduction and Arbitrary Objects’, Journal of Philosophical Logic, 14: 57-107, 1985.
  • [15] Fitch, F. B., Symbolic Logic, Ronald Press Co., New York 1952.
  • [16] Fitch, F. B., ‘Natural Deduction Rules for Obligation’, American Philosophical Quarterly, 3: 27-38, 1966.
  • [17] Fitting, M., Proof Methods for Modal and Intuitionistic Logics, Reidel, Dordrecht 1983.
  • [18] Fitting, M., First-Order Logic and Automated Theorem Proving, Springer, Berlin 1996.
  • [19] Garson, J. W., Modal Logic for Philosophers, Cambridge University Press, Cambridge 2006.
  • [20] Gentzen, G., ‘Über die Existenz unabhängiger Axiomensysteme zu unendlichen Satzsystemen’, Mathematische Annalen, 107: 329-350, 1932.
  • [21] Gentzen, G., ‘Untersuchungen über das logische Schließen’, Mathematische Zeitschrift, 39: 176-210 and 405-431, 1934.
  • [22] Gentzen, G., ‘Die Widerspruchsfreiheit der reinen Zahlentheorie’, Mathematische Annalen, 112: 493-565, 1936.
  • [23] Hazen, A. P., ‘Natural Deduction and Hilbert’s Epsilon-Operator’, Journal of Philosophical Logic, 16: 411-421, 1987.
  • [24] Hazen, A. P. and F. J. Pelletier, ‘Gentzen and Jaśkowski Natural Deduction: Fundamentally Similar but Importantly Different’, Studia Logica, 102(6): 1103-1142, 2014.
  • [25] Herbrand, J., abstract in: Comptes Rendus des Séances de l’Académie des Sciences, vol. 186, p. 1275, Paris 1928.
  • [26] Herbrand, J., ‘Recherches sur la théorie de la démonstration’, in: Travaux de la Société des Sciences et des Lettres de Varsovie, Classe III, Sciences Mathématiques et Physiques, Varsovie 1930.
  • [27] Hermes, H., Einführung in die mathematische Logik, Teubner, Stuttgart 1963.
  • [28] Hertz, P., ‘Über Axiomensysteme für beliebige Satzsysteme’, Mathematische Annalen, 101: 457-514, 1929.
  • [29] Indrzejczak, A., ‘Natural Deduction System for Tense Logics’, Bulletin of the Section of Logic, 23(4): 173-179, 1994.
  • [30] Indrzejczak, A., Natural Deduction, Hybrid Systems and Modal Logics, Springer 2010.
  • [31] Jaśkowski, S., ‘Teoria dedukcji oparta na dyrektywach założeniowych’, in: Księga Pamiątkowa I Polskiego Zjazdu Matematycznego, Uniwersytet Jagielloński, Kraków 1929.
  • [32] Jaśkowski, S., ‘On the Rules of Suppositions in Formal Logic’, Studia Logica, 1: 5-32, 1934.
  • [33] Kalish, D. and R. Montague, Logic: Techniques of Formal Reasoning, Harcourt, Brace and World, New York 1964.
  • [34] Mates, B., Stoic Logic, University of California Press, Berkeley 1953.
  • [35] Negri, S. and J. von Plato, Structural Proof Theory, Cambridge University Press, Cambridge 2001.
  • [36] Pelletier, F. J., ‘A Brief History of Natural Deduction’, History and Philosophy of Logic, 20: 1-31, 1999.
  • [37] Pelletier, F. J. and A. P. Hazen, ‘A History of Natural Deduction’, in: D. Gabbay, F. J. Pelletier and J. Woods (eds.), Handbook of the History of Logic, vol. 11, pp. 341-414, 2012.
  • [38] von Plato, J., ‘Gentzen’s Proof of Normalization for Natural Deduction’, The Bulletin of Symbolic Logic, 14(2): 240-257, 2008.
  • [39] von Plato, J., ‘From Axiomatic Logic to Natural Deduction’, Studia Logica, 102(6): 1167-1184, 2014.
  • [40] Popper, K., ‘Logic Without Assumptions’, Proceedings of the Aristotelian Society, 47: 251-292, 1947.
  • [41] Popper, K., ‘New Foundations for Logic’, Mind, 56, 1947.
  • [42] Prior, A. N., ‘The Runabout Inference-Ticket’, Analysis, 21: 38-39, 1960.
  • [43] Prawitz, D., Natural Deduction: A Proof-Theoretical Study, Almqvist and Wiksell, Stockholm 1965.
  • [44] Prawitz, D., ‘Ideas and Results in Proof Theory’, in: J. E. Fenstad (ed.), Proceedings of the Second Scandinavian Logic Symposium, North-Holland, Amsterdam 1971.
  • [45] Quine, W. V. O., Methods of Logic, Holt, New York 1950.
  • [46] Raggio, A., ‘Gentzen’s Hauptsatz for the Systems NI and NK’, Logique et Analyse, 8: 91-100, 1965.
  • [47] Restall, G., ‘Normal Proofs, Cut Free Derivations and Structural Rules’, Studia Logica, 102(6): 1143-1166, 2014.
  • [48] Schroeder-Heister, P., ‘A Natural Extension of Natural Deduction’, Journal of Symbolic Logic, 49: 1284-1300, 1984.
  • [49] Schroeder-Heister, P., ‘Uniform Proof-Theoretic Semantics for Logical Constants (Abstract)’, Journal of Symbolic Logic, 56: 1142, 1991.
  • [50] Schroeder-Heister, P., ‘Proof-Theoretic Semantics’, in: E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy, 2012.
  • [51] Schroeder-Heister, P., ‘The Calculus of Higher-Level Rules, Propositional Quantification and the Foundational Approach to Proof-Theoretic Harmony’, Studia Logica, 102(6): 1185-1216, 2014.
  • [52] Suppes, P., Introduction to Logic, Van Nostrand, Princeton 1957.
  • [53] Tarski, A., ‘Fundamentale Begriffe der Methodologie der deduktiven Wissenschaften’, Monatshefte für Mathematik und Physik, 37: 361-404, 1930.
  • [54] Troelstra, A. S. and H. Schwichtenberg, Basic Proof Theory, Cambridge University Press, Cambridge 1996.
  • [55] Vigano, L., Labelled Non-Classical Logics, Kluwer Academic Publishers 2000.

 

Author Information

Andrzej Indrzejczak
Email: indrzej@filozof.uni.lodz.pl
University of Lodz
Poland

Pierre Bayle (1647–1706)

Pierre Bayle was a seventeenth-century French skeptical philosopher and historian.  He is best known for his encyclopedic work The Historical and Critical Dictionary (1697, 1st edition; 1702, 2nd edition), a work which was widely influential on eighteenth-century figures such as Voltaire and Thomas Jefferson.  Bayle is traditionally described as a skeptic, though the nature and extent of his skepticism remain hotly debated.  He is also known for his explicit defenses of religious faith against the attacks of reason, for his attacks on specious theological doctrines, and for his formulation of the doctrine of the erring conscience as a basis for religious toleration.

In contrast to his seventeenth-century contemporaries, Bayle is fundamentally an anti-systematic thinker. In keeping with his skepticism (understood in the ancient sense), he is committed to the thorough examination of arguments for and against the position under examination.  This entails making the best arguments possible on both sides, as well as raising the strongest possible objections to both sides.  As a result, in many cases, it is difficult to determine just what Bayle’s position is.  Commentators refer to this phenomenon as the “Bayle enigma,” and it affects virtually every area of Bayle’s thought, undermining the legitimacy of his defenses of religious faith and calling into question the sincerity of his attacks on theology.

Bayle’s influence extends beyond philosophers; his texts have occasioned interest from historians, theologians, literary scholars, and political theorists.  Bayle was incredibly prolific, both in personal correspondence and in published work.  The encyclopedic format of his Dictionary showcased the dazzling breadth and depth of his knowledge, a learning which was also on display during his years as the editor of the intellectual journal News from the Republic of Letters (1684-1687).  Bayle produced most of the content of the journal—primarily book reviews—during his editorship.  His authorship of anonymous works has also been established, most recently in the case of the Important Advice to Refugees (1690).  The enormous variety of topics that Bayle treated over the course of his lifetime, the diversity of formats that he used to do so, and the indeterminate nature of his arguments make him a rich topic for scholarly investigation.

Table of Contents

  1. Biography and Intellectual Context
  2. Anti-Systematicity
  3. Skepticism
    1. What Kind of Skeptic was Bayle?
      1. The “Surreptitious Atheist” Reading
      2. The “Christian Fideist” Reading
    2. Moral Knowledge
  4. The Problem of Evil
  5. The Erring Conscience and Religious Tolerance
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography and Intellectual Context

Bayle was raised as a French Calvinist, or Huguenot, from his birth in 1647 in Le Carla, a small village in the south of France, until he left for the Jesuit college in Toulouse.  His father, a Huguenot pastor, and his family were astonished by his 1669 conversion to Catholicism, presumably as a result of his studies under the Jesuits at Toulouse.  Bayle reconverted to Calvinism eighteen months later, however, officially becoming a rélaps, the most persecuted religious classification under the French Catholic monarchy.  Predictably, Bayle then fled France, and studied at a Calvinist seminary in Geneva for two years under Louis Tronchin.  After Bayle figured out that the pastoral vocation was not for him, he transferred to the University of Geneva to study Cartesian philosophy.  After completing his studies there and returning to France in disguise as “Bâle” in 1674, Bayle spent a year as a tutor in Rouen and Paris before securing a position in 1675 at the Protestant Academy of Sedan.

It was at Sedan that Bayle first came into contact with Pierre Jurieu, a Calvinist theologian who became Bayle’s mentor, but over time, his most bitter enemy.  Bayle and Jurieu initially were so close that when the French government closed the Sedan academy in 1681, Bayle followed Jurieu to the Ecole Illustre, an academy in Rotterdam for Huguenot refugees where they both joined the faculty.  Their mutual animus likely had its genesis in Bayle’s refusal of a marriage arranged by the Jurieu family, but there were also intellectual reasons for the cooling of Bayle and Jurieu’s relationship.  The publication of Bayle’s Philosophical Commentary (1686-88), which advocated religious toleration, had already raised Jurieu’s suspicion of Bayle.  The animosity increased markedly in 1690, when Bayle’s anonymously-published Important Advice to Refugees occasioned heated attacks by Jurieu, who saw the work as profoundly anti-Protestant.

During his initial years in Rotterdam, almost all of Bayle’s writings had been focused on attacking Catholic theology and practice, including General Critique of Maimbourg’s History of Calvinism (1682), Diverse Thoughts on the Occasion of a Comet (1683), and An Entirely Catholic France (1686).  The death of Bayle’s father and brothers in 1684 and 1685, and the Revocation of the Edict of Nantes in 1685, provided strong personal reasons for Bayle to attack Catholic intolerance.  Jurieu saw the Advice to Refugees, however, as evidence that Bayle had turned against his Huguenot roots, and denounced Bayle as a heretic.  Jurieu’s public proclamations against Bayle, however, were inconsistent with Bayle’s fidelity to the Reformed community in Rotterdam, and evidence from Bayle’s deathbed seems to support his adherence to the Calvinist religion for the rest of his life.

The text that solidified Bayle’s reputation as a grave danger to religious belief, however, was his Dictionary, the encyclopedic work that was Bayle’s magnum opus.  The Dictionary contains many articles that implicitly criticize his Protestant contemporaries, including Jurieu, as well as articles that seem to undermine the rationality of religious belief as a whole.  Bayle clarified his criticisms in the second edition of the Dictionary in 1702, which included “Eclaircissements”, or Clarifications, on several of the most controversial articles.  These explanations did not deflect criticism, however, and Bayle provided even more fodder for his critics with the publication of his Response to the Questions of a Provincial (1704) and Continuation of Diverse Thoughts (1705).  These late works contain reconstructions of coherent atheist positions, and support Bayle’s earlier position from Diverse Thoughts that atheists could be morally upright.  Bayle continued to respond to his critics until the day of his death on December 28, 1706.  That day, he wrote in a letter to a friend, “I am dying as a Christian philosopher, convinced of and pierced by the bounties and mercy of God.”

Despite this final piece of evidence toward Bayle’s religious fidelity, many Enlightenment philosophes in the generations following Bayle saw him as their intellectual ancestor.  One of Bayle’s most famous admirers was Voltaire, who is probably most responsible for Bayle’s reputation as the “arsenal of the Enlightenment,” a reference to the many arguments that the philosophes found in Bayle.  The philosophes adapted these arguments to attack religious and superstitious beliefs among philosophers and theologians, using the arguments to show the absurdity of any supernatural belief whatsoever.  The Enlightenment portrait of Bayle has defined his place in intellectual history, until the more recent interpretations of the twentieth century.

2. Anti-Systematicity

In the history of early modern philosophy, Bayle is one of the most controversial, and least understood, intellectuals of the period.  Unlike other canonical seventeenth-century figures, Bayle gave no explicit systematization of his philosophical positions.  While Bayle wrote on philosophical and theological problems ranging from toleration to the problem of evil, he produced no definitive or complete exposition of his ideas.  Despite the widespread popularity of his Dictionary, Bayle is typically not considered to be a canonical philosopher.  This is perhaps because the philosophical insights in Bayle’s work are buried in theological polemic, obscure reference material, and extremely prolix arguments.  Until recently, relatively few scholars have taken on the difficult task of mining these insights.

The Dictionary, one of the most problematic texts of the early modern period, is the obvious place to begin any interpretation of Bayle’s thought.  Bayle’s stated purpose in writing the Dictionary was to update and correct Louis Moréri’s Grand Historical Dictionary (1674).  Bayle thought that Moréri’s dictionary was hopelessly out of date and inaccurate, and he hoped that his own work would replace Moréri’s as a standard reference work.  The Dictionary, however, is neither objective nor exhaustive, at least by today’s standards.  The majority of the Dictionary’s pages are not even devoted to the scholarly articles themselves, but rather to remarks and footnotes that Bayle uses to articulate his own thoughts on the topics of the articles – or even on other topics that are only tangentially related to the topic of the article.  Furthermore, Bayle routinely makes mutually inconsistent claims throughout the Dictionary.

It is not just the underdetermined, dense, and paradoxical, nature of the Dictionary that poses an interpretive problem for scholars of Bayle; the problem is magnified when one examines Bayle’s corpus as a whole.  The breadth and complexity of his work is dizzying; Bayle’s writing ranges over a wide variety of topics and genres, from superstition to Biblical exegesis to astronomy to metaphysics, and from historical critiques to literary reviews to journal articles to theological treatises.  Elisabeth Labrousse, an internationally regarded scholar of Bayle, notes that “[a]t turns, Bayle speaks the language of a Calvinist theologian, a Huguenot pamphleteer, a disciple of Malebranche, or a spiritual child of Erasmus, Montaigne, and Naudé.”  Furthermore, Bayle’s scholarship on all of these topics and in all of these genres was exhaustingly thorough.  His scholarly training at Toulouse taught him to examine not just his own position on a particular issue, but also to examine all possible objections and replies to his position, in as much detail as necessary to demolish his opponent.  His arguments cite both the relevant historical and contemporary sources, a testament both to his encyclopedic mind and to his lifelong obsession with the intellectual trends of his day.  Bayle’s arguments are so intricate that it is often unclear exactly what positions the arguments are supposed to be defending.  As Jean Delvolvé, an early twentieth-century scholar of Bayle, aptly notes,

The very originality of Bayle’s ideas, their lack of systematic construction, their diffusion in the mass of a work that is prolix to excess, their intentionally obscured and enveloped exposition – for they must be discovered through a thousand réticences, and among the trompe-l’oeil of affirmations to the contrary – all these reasons hindered the comprehension of Bayle by his contemporaries and have hindered him taking his rightful place in the history of human thought.

The paradoxes of Bayle’s work have given rise to a number of different readings of Bayle.  First, the complexity and seeming ambiguity of Bayle’s arguments have been cited as evidence that Bayle ought to be read primarily as an ironic critic.  According to this reading of Bayle, all of his arguments that ostensibly defend traditionalist positions are really just vehicles for proto-Enlightenment critiques of those same positions.  The completeness of Bayle’s arguments, and his dedication to charitable reconstruction of his opponents’ arguments, are not evidence of Bayle’s responsible scholarship, but rather an opportunity for him to advance his own subversive views.  That these views are in fact Bayle’s is supported by the paradoxical replies and weak counterarguments that he offers in response to the charges of his opponents.  According to this reading, Bayle’s apparent acceptance of what seem to be obviously anti-intellectual paradoxes by an otherwise philosophically sophisticated mind provides support for reading Bayle as a kind of subversive anti-traditionalist.

An alternative reading of Bayle is as a kind of complicated traditionalist.  According to this reading, the complex structure of Bayle’s arguments reflects neither subversive critique nor rational agnosticism.  Instead, it reflects Bayle’s desire to demonstrate for his opponents, via a reductio ad absurdum, the paradoxes of reason with respect to metaphysics in general, and with respect to philosophical theology in particular.  This reductio of reason provides an explanation both of Bayle’s use of rigorous philosophical argumentation, and of his explicit affirmation of apparent paradoxes.  This reading of Bayle as a philosopher who uses reason to disarm itself is consistent not only with his commitment to responsible argument, but also with the evidence of his lifelong adherence to traditional Huguenotism.

Recent readings of Bayle have resisted attempts to make him into either an ironic critic or a complicated traditionalist.  This anti-systematic reading of Bayle recognizes the multiple ambiguities and difficulties inherent in any attempt to provide a systematic interpretation of Bayle.  According to this reading, the nature of Bayle’s texts prohibits fixing any sort of singular interpretation to his thought.  What is most distinctive about Bayle’s thought is not its irony or traditionalism, but rather its dialogic character and polyphonic thinking.  Bayle’s texts consistently allow multiple voices to speak autonomously, rather than as vehicles for his own views; it is thus a grave interpretive error, on this reading, to impose an artificial systematization on a text to create a single voice or interpretation.  In other words, the typical temptation to force internal consistency onto Bayle’s texts – even a skeptical consistency – would not just be a hermeneutic mistake; it would be a philosophical one, because it would require the pursuit of consistency between arguments defending opposing positions.

3. Skepticism

a. What Kind of Skeptic was Bayle?

Reading Bayle as a skeptic of one kind or another has a long history, going back to his own contemporaries and continuing through present-day commentators.  The sense in which Bayle is a skeptic is not entirely straightforward, but what is clear is that Bayle exhibits a profound suspicion of reason’s ability to deliver certain knowledge.  In Bayle’s view, reason seems to be useful in enabling us to draw conclusions about the world, but it runs into so many contradictions and yields so many paradoxes that it ultimately undermines itself, and thus cannot be trusted.  Thus, Bayle’s skepticism is, minimally, skepticism about the reliability of reason.  Aside from this point, however, interpreters of Bayle diverge about the nature and extent of Bayle’s skepticism.  How best to understand Bayle’s skepticism is often a function of the more general reading that one takes of Bayle’s overall projects and positions.

i. The “Surreptitious Atheist” Reading

Taking its cues from the “ironic critic” reading of Bayle, this interpretation of Bayle’s skepticism sees it as fundamentally a kind of Stratonianism, a position that Bayle outlines in the Continuation of Diverse Thoughts (1705).  Strato, the position’s namesake, was the third leader of the ancient Lyceum, after Aristotle and Theophrastus.  Unlike other ancient philosophers, Strato was uncompromising in his atheism.  Bayle is less interested in the position advocated by Strato himself than in a modern adaptation of Stratonianism.  This is because Strato represents for Bayle the position of seventeenth-century libertins: the denial of a providential God, and the affirmation of the eternity and infinity of the universe.

The case that Stratonianism represents Bayle’s own philosophical position is not found in Bayle’s arguments themselves, but rather in a methodological feature of their structure.  Bayle typically structures his arguments not to support directly the position he actually holds; rather, he constructs the best possible argument for the strongest opposing position, and then defeats it later.  This eventual defeat makes evident the superiority of the position Bayle actually holds.  Bayle explicitly develops the position of the Stratonian atheist over the course of the Continuation, and, according to this reading, this position is never refuted by Bayle.  Thus, the strongest opposing position to natural philosophical theology is left standing as a menace to theist philosophers.  Reading Bayle in this way assumes that if Bayle’s position were not that of the Stratonian atheist, then he would have provided more decisive objections; in the absence of those objections, Bayle is implying that Stratonianism is the only philosophically defensible position.

ii. The “Christian Fideist” Reading

Taking its cues from the “complicated traditionalist” reading of Bayle, this interpretation of Bayle’s skepticism sees it as a kind of fideism.  Bayle’s (heterodox) Calvinism, and the context of Cartesianism and Protestant theology more generally, is taken as fundamental to his thought.  According to this reading, the complex structure of Bayle’s arguments reflects not an implicit atheism, but rather his desire to demonstrate the paradoxes of reason with respect to metaphysics, and with respect to the metaphysical claims of religion in particular.  This demonstration of the paradoxes of reason provides a basis both for Bayle’s affirmation of Calvinist theology, and for his use of rigorous philosophical argumentation.  This reading is thus consistent not only with his commitment to responsible argument, but also with his apparent lifelong adherence to the Calvinist faith.

Textual evidence for this reading can be found in Bayle’s furious reply to the Jesuit father Maimbourg’s History of Calvinism (1682).  Bayle wrote his reply – the General Critique – in two weeks, and in it, Bayle makes clear both his Protestant convictions and his commitment to them.  Bayle emphasizes that since the workings of Providence are infinite, they cannot be comprehended by finite reason.  At the same time, French Calvinism contains strong elements of Cartesianism, and Bayle himself asserts in Diverse Thoughts that his views were not far from those of Malebranche.

Ultimately, though, this reading holds that Bayle’s pessimistic assessment of reason is what characterizes the bulk of his work.  Throughout Bayle’s journal News from the Republic of Letters, he makes critical remarks about the arguments of secular rationalists, and these remarks indicate that all rational investigation of theological or philosophical questions results in puzzles that reason is powerless either to affirm or deny.  Bayle also remarks in his Dictionary that “there is no contradiction between these two things: (1) the light of reason teaches me that that is false; (2) Moreover, I believe it because I am persuaded that this light is not infallible and because I prefer to defer to the proofs of sentiment and to the impressions of conscience, in a word, to the word of God, than to defer to a metaphysical demonstration” (“Spinoza,” Rem. M).  This is evidence not only of Bayle’s sincerity in his faith, but also of his confidence in the coherence of his religious and philosophical views.

b. Moral Knowledge

Bayle’s account of moral knowledge rests on a function of reason that he calls la droite raison, or right reason.  Despite his skepticism, Bayle seems to hold that what he calls the “common notions” of morality are well-grounded insofar as they come from right reason.

The most famous example of a “common notion” delivered by right reason is found in Bayle’s Philosophical Commentary, where he argues that the interpretation of Scripture must be limited by the “clear and distinct notions of the natural light… with respect to morality” (I.i).  This conclusion initially appears to be quite heterodox; if read in its most radical form, it seems to imply that any Christian doctrine that is refuted by reason (“the natural light”) is false.  What Bayle actually asserts here, however, is not the falsity of any Christian doctrine that is against reason; rather, he asserts only the falsity of particular dogmas that are purported to be in Scripture.  For Bayle, the “natural light” reveals the immorality of the forced conversions for which Catholics purported to find justification in Scripture, and their immorality invalidates their purported justification.  This highlights the most important consequence of the passage: that the natural light trumps the claims of dogma with respect to morality.  Bayle’s skepticism entails that the natural light is fallible, and can be self-contradictory in some domains.  It appears, however, that the natural light is reliable with respect to moral truths – at least, with respect to those that apply to humans.

Bayle reiterates the reliability of the natural light with respect to moral truths consistently throughout the Philosophical Commentary, which is unsurprising since the text is a defense of the morality of religious toleration.  This position, however, appears in other texts as well.  In Diverse Thoughts, wherein Bayle argues that atheists can be moral, he notes that certain moral principles are not only rational, but that moral praise and blame can be rationally assigned to those who live accordingly.

Bayle argues that the atheist has access to right reason, which confirms basic moral truths.  Bayle also provides examples of the specific basic moral truths in question: “it is rational to respect one’s father, to hold to one’s word, to console the afflicted, to help the poor, to have gratitude for one’s benefactors, etc.” (OD III 406a).  There is no hint of any of the skeptical doubts that Bayle characteristically raises; this suggests that he is using a non-skeptical notion of reason when discussing basic moral beliefs.

One of the final texts of Bayle’s life, Response to the Questions of a Provincial (RQP, 1704-1707), also offers evidence of Bayle’s insistence on the rational accessibility of moral truths.  Bayle’s position there is that atheists can be moral because they can know the conformity of virtue with right reason.  He concedes that if this were not true – that is, if morality were only clearly conceivable through revelation – then atheists could not be moral.  According to Bayle, however, right reason is as universal as the principles of logic.  Bayle’s point in RQP is not to highlight the universality of the principles of logic, but simply to note that if one acknowledges the authority of the principles of logic, then the sort of reason at issue here – right reason – should enjoy the same privileges.  Other passages in RQP call into question the universality of right reason, particularly in rendering moral judgments about the conduct of God, but not with respect to human conduct.

Bayle’s Abridged System of Philosophy (1675-1680), which consists of lecture notes from his first position as a professor at Sedan, is where he provides his most systematic treatment of the notion of right reason.  In the section of notes on moral theory, Bayle defines right reason as “the judgment that the soul naturally renders on practical conclusions, or conclusions regarding morality that are drawn from practical principles” (OD IV 261b).  Bayle thus restricts the scope of right reason to moral, or practical, principles.  Unlike the merely plausible conclusions delivered by a skeptical conception of reason, the natural light of reason – which Bayle uses interchangeably with right reason when the natural light is illuminating practical matters – suffices, Bayle argues, to know moral truth.  The principles of morality that are known by right reason are universally and evidently true.  Bayle argues, further, that right reason is also the standard by which the goodness of particular actions is judged (OD IV 261b).

There is a significant complication in Bayle’s account of moral knowledge, however; in the midst of a discussion on right reason, he introduces the notion of conscience.  Bayle defines conscience as

a practical judgment of the understanding, which dictates to us that we must do or ought to have done something, as being praiseworthy, and that we must avoid or ought to have avoided something, as being shameful.  In a word, it’s an understanding of the natural law by which each person judges which thing is praiseworthy & ought to be done, and which other thing is shameful & ought to be avoided (OD IV 261b).

This sounds very similar to Bayle’s description of the guidance offered by right reason.

Bayle’s account of moral knowledge is further complicated by his use of illumination language to describe the conscience: he claims that the “natural light” leads us to affirm the principles of morality.  He initially refers to natural morality itself as a “certain light in the soul” that obliges the recognition of general principles of morality.  He also, however, makes reference to the light by which we affirm the principles of morality, and which supposedly leads us to natural morality.  There seems to be a distinction, then, between “natural morality” (“the first general principles of morality”), which is a certain light, and the “natural light” of conscience – non-identical to the “natural light” of reason – for which the standard is not praiseworthiness, but rather fairness.  Further, those led by conscience are merely supposed to have natural morality.

There is a clear connection for Bayle, then, between right reason as the faculty that grounds moral knowledge, and our rational nature – or at least the remnants of our prelapsarian rational nature.  Unfortunately, this connection also opens the possibility that the obligation of one’s conscience could attach to moral beliefs that were erroneous, or that were in some way contrary to the dictates of right reason, if the conscience were not being guided by right reason.  Right reason is a crucial check on the moral “knowledge” provided by conscience in the following ways.  First, the conscience can be affected by prejudices and errors, and unless it is rid of those, it cannot function as a moral guide.  Relatedly, as a result of its susceptibility to prejudice and error, a conscience can be falsely persuaded of the licitness or illicitness of a particular action.  Finally, one whose conscience is falsely persuaded can still commit acts that are in conformity with right reason, even though her erring conscience says that such acts are illicit.  Similarly, a person who commits a wrongful act deemed by his erring conscience to be licit is still acting against right reason, despite the conformity with conscience.  Thus, while conscience delivers verdicts on the morality of particular actions by particular individuals, right reason is the ultimate arbiter of morality in general.  This provides a significant external check on the potentially erring conscience.

4. The Problem of Evil

Bayle’s treatment of the problem of evil is well-known, and occasioned Leibniz’s writing of the Theodicy (1710).  Bayle’s Dictionary articles on the Paulicians, the Manicheans, and the Marcionites, as well as his subsequent clarification of the “Paulicians” and “Manicheans” articles, are where Bayle develops the position to which Leibniz is responding.  Bayle also treats the issue in Response to the Questions of a Provincial and Dialogues of Maximus and Themistius (1707), where he critiques rationalist responses to the problem of evil.  Bayle is pessimistic regarding the use of reason to make sense of evil: he holds that a priori reasons fail to address the a posteriori reality of evil.  In other words, any attempt to explain the existence of evil rationally is contradicted by lived experience.  Bayle supports this position by showcasing the strengths and weaknesses of both the orthodox and the Manichean solutions to the problem of evil, and concludes that both positions fail.  What is more, these solutions do not fail because they lie beyond the ken of human reason; the proposed solutions are comprehensible to reason, but simply fail its evaluation.

Bayle’s first extensive treatment of the problem of evil is in the Dictionary.  In particular, the articles on the Manicheans and the Paulicians provoked a strong response from his fellow Huguenot refugees in Holland, prompting him to write a clarification of his position in those two articles for the second edition of the Dictionary.  In Remark D of “Manicheans,” Bayle considers two different responses to the problem of evil, using the personage of Zoroaster on the one hand, and Melissus of Samos on the other.  Bayle frames their positions in terms of a priori and a posteriori reasons.  According to Bayle, the rational notions of order are what naturally lead us to think that an eternal, self-existent, and necessary being must also possess omnipotence and omnibenevolence.  For Bayle, this is an instance of an a priori reason: the ideas therein are clear and distinct, and it is internally coherent.  With respect to the problem of evil, however, a priori reasons are merely the beginning of the discussion; this is because evil is a phenomenon – it is experienced.  This entails that, according to Bayle, a posteriori reasons are also relevant; the conclusion supported by a priori reasons – that of a single unifying principle – may or may not be the conclusion supported by a posteriori reasons.

Bayle imagines a debate between Melissus and Zoroaster in which they examine the pros and cons of both proposed solutions to the problem of evil, with Melissus defending the single unifying principle, and Zoroaster defending the existence of two principles, one evil and one good. Melissus holds that a priori reasons favor the existence of a single unifying principle, and Zoroaster agrees that Melissus surpasses him “in the beauty of ideas and in a priori reasons” (305b).  Zoroaster challenges Melissus, however, to explain the source of the evil caused by humankind, and argues that the existence of two principles better explains this phenomenon; it provides better a posteriori reasons than a single unifying principle. Even when Melissus argues that physical evil is simply a response of God’s justice to moral evil, Zoroaster replies that humankind’s inclination to evil is a defect that could not be caused by a single unifying principle with every perfection.  Melissus’ final attempt to blame humankind for evil fails, according to Zoroaster, because even the freedom that Melissus claims for humankind is not genuine, since it exists completely by the action of God.  Zoroaster argues that it is inconsistent with a priori reasons that a single, omnibenevolent principle would not only fail to prevent moral evil, but would then punish humankind with physical evil for the moral evil that it commits – but for which the single principle is still ultimately responsible.

There is a rational intractability, then, in Bayle’s conception of the problem of evil: a priori reasons contradict a posteriori evidence, and yet the solution that best accounts for the a posteriori evidence – the “two principles” solution – is inconsistent with a priori reasons – particularly with the notion that a single omnibenevolent principle could in any way be the origin of evil. The intractability of the problem forces Bayle to propose an entirely different strategy: the only way out of the rational dilemma of evil is to look beyond the contradictions of reason to the realm of facts.  (By “facts,” Bayle means something like “that which is found in Scripture”.)  In the case of the problem of evil, the relevant “fact” is the evidence of Scripture that an omnibenevolent, holy, and omnipotent God has either allowed or caused evil to exist. Further, as revelation, Scripture is not merely additional a posteriori evidence; it has the added epistemological weight of faith.  The actuality of this state of affairs – the coexistence of this kind of God with evil – is enough to counter the objection of impossibility, according to the principle of logic: “From the actual to the possible is a valid inference.”  This factual strategy for addressing the problem of evil is consistent throughout the rest of the Dictionary, and is consistent with Bayle’s continual insistence in the Dictionary on the supremacy of revelation (“faith”) in the face of rational challenges.

Though the Dictionary is the most famous place where Bayle engages the problem of evil, his last two works, Response to the Questions of a Provincial and Dialogues of Maximus and Themistius, contain an extensive treatment of related issues as well.  Bayle’s targets are many in these works, but one of the central ones is Isaac Jacquelot, a Reformed theologian who defends a theodicy-type position.  Jacquelot was one of the Huguenot rationaux, a group of intellectuals defined by Calvinist theological commitments and broadly Cartesian philosophical ones.  Jacquelot was deeply engaged in the project of rational theology, and had a fruitful intellectual history with Bayle.  Jacquelot was profoundly influenced by Malebranche, particularly regarding the divine omniscient governance of nature and the sinful effects of free will.  The common interests of Malebranchean philosophy and Huguenot theology make Jacquelot an excellent interlocutor for Bayle.  Bayle’s proposed solution to the problem of evil remains essentially unchanged from his position in the Dictionary: ultimately, it is futile to marshal a priori reasons against the fact of the coexistence of God’s nature with evil.

Bayle’s proposed solution to the problem of evil reappears in Response to the Questions of a Provincial as part of a debate about free will.  Since a hallmark of Reformed theology is the total sovereignty of God over creation, it is difficult for any reformer to hold that the freedom granted to humankind can clear God of responsibility for the evil acts of his creatures.  If God is truly sovereign, then he would have some kind of governance over the choices of humans – minimally, he would have foreknowledge of the choices causally connected to the existence of evil, and thus foreknowledge coupled with omnipotence seems to entail a responsibility for God to act such that evil does not come into existence.  If this is true, then God is in fact responsible for the existence of evil just insofar as he has not prevented it.  Bayle never denies any part of this argument; he seems unwilling to overlook or explain away its various premises in the way that his predecessors and contemporaries do.

Bayle’s original proposal for addressing the coexistence of God and evil, however, is consistent with this line of argumentation.  As in the Dictionary, Bayle advocates in RQP a “factual” approach to the intractability between God’s omnibenevolence and evil: Scripture declares that this coexistence is so, and it is nonsensical for reason to argue against a matter of fact.  Bayle also explicitly rejects Jacquelot’s proposal that the incompatibility is simply above reason, refusing the “above reason/against reason” distinction itself.  According to Bayle, there is no such thing as “above reason” when the reason at issue is human reason: either an axiom is compatible with human reason, or it is against human reason.  If something appears not to conform to human reason, then by definition, Bayle argues, it also appears contrary to it.

One objection to this reading of Bayle is that there is in fact not much difference between Bayle’s position and the “above reason” position – the two represent a distinction without a difference.  If Bayle ultimately endorses belief in the coexistence of God and evil in the face of apparent contradiction, the objection goes, he is at least implicitly endorsing some truth that is beyond human reason.  The point of true disagreement, however, is that according to the “above reason” position, what is above human reason is still consistent with human reason, though incomprehensible to it.  When one considers the divine mysteries, however, it is obvious that, to the extent that they are comprehensible by human reason, they run contrary to it.  The doctrine of the Trinity runs contrary to the laws of mathematics; the doctrine of the Incarnation runs contrary to our conception of an object’s ability to have more than one nature; and the doctrine of Jesus’ bodily resurrection runs contrary to our conception of the nature of physical bodies.  These conflicts are within the realm of human reason, not above it, and though the mysteries are not fully explicable – thus “mysteries” – they are comprehensible enough to make the conflict a real one, not merely apparent.

In the Dialogues of Maximus and Themistius, Bayle is careful to restrict his rejection of the “above reason/against reason” distinction to the scope of human reason.  This is because the problem of evil is so repugnant to human reason that the only possible response to it must completely throw out the conclusions of human reason.  Bayle challenges Jacquelot to explain how God’s allowing evil could ever be adequately explained using human reason.  According to human reason, Bayle argues, God’s allowing evil to exist violates a priori reasons and our idea of God as omnibenevolent.  The position here is essentially that of the Dictionary, and Bayle’s reiteration of it in the Dialogues seems to show that he is unimpressed with Jacquelot’s proposed solution to the problem of evil.

According to Bayle, the specific problem with Jacquelot’s proposed solution to the problem of evil is that Jacquelot accepts divine foreknowledge.  Presumably, Jacquelot’s retention of divine foreknowledge is supposed to support the possibility of a free will defense.  Bayle notes, however, that divine foreknowledge does little to help: even with divine foreknowledge, the existence of evil calls into question God’s omnibenevolence, since a being who foresees the negative consequences of free will cannot have good intentions if he persists in bestowing it on humans.

These objections support Bayle’s assertion in the Dialogues that his solution to the problem of evil is really the last left standing: believing, in spite of lacking an understanding of how God’s omnibenevolence is compatible with evil.  Importantly, for Bayle, this belief is not grounded in the faculty of reason, but rather on the declaration of Scripture that God and evil in fact coexist.  Bayle’s later works trend toward a kind of moral rationalism with respect to human conduct, but his advocacy of this factual solution to the problem of evil never changes throughout his life, and his debate with Jacquelot on the problem of evil does not undermine the tenability of his position.  Divine conduct is simply not susceptible to the judgments of right reason.

5. The Erring Conscience and Religious Tolerance

Bayle’s concern with conscience and toleration is not limited to the Philosophical Commentary, but that text is where Bayle most clearly argues for religious toleration.  He articulates two lines of argument for religious toleration: one on the basis of his doctrine of the erring conscience, as developed in the General Critique (1682) and the New Critical Letters (1685); and one on the basis of a principle of the natural light according to which any reading of Scripture that implies a moral crime is a false reading.  For Bayle, both ways of arguing for religious toleration are necessary in order to prevent coercion of, or by, people who act on the basis of conscience – whether that conscience is accurate or erring.

Bayle’s argument for religious toleration based on his doctrine of the erring conscience assumes that we have a duty and a right to act according to the lights of conscience.  This is a less controversial claim when the beliefs of conscience are accurate; however, Bayle’s doctrine of the erring conscience entails that even when the beliefs of conscience are in error, the same duties and rights of conscience obtain.  Bayle does place some conditions on the erring conscience’s acquiring these duties and rights; only when the erring conscience is “in good faith” – that is, when the error is sincere – does the erring conscience obtain the relevant rights and duties.  Bayle consistently holds to the “good faith” requirement in both the New Critical Letters and the Philosophical Commentary; in the New Critical Letters, he writes that “[a]ll good faith errors have the same right over conscience as orthodoxy, whether we embraced those errors a bit too lightly, or whether we ran them through the most rigorous examination that we could manage.”   Bayle places the good faith errors of the sincere lay person on the same footing as the good faith errors of the rigorous intellectual – and, most significantly, on the same ground as orthodoxy.

This allows Bayle to affirm a kind of moral equivalence between the accurate conscience and the erring one: whatever rights and duties accrue to an accurate conscience also accrue to the erring conscience.  Thus, if the beliefs of the accurate conscience ought to be tolerated, so ought the beliefs of the erring conscience.  Bayle marshals several different arguments for the moral equivalence claim, but the most powerful is the argument from skepticism.  Presumably, each person cannot help but think that her conscience is in the right in cases where beliefs of conscience conflict.  In the absence of definitive and objective proof for a belief of conscience, then, there is no reason to grant one conscience rights and duties over another.

A serious potential problem arises with respect to the doctrine of the erring conscience, however: the issue of fanaticism.  Assuming that an erring conscience has all of the same duties and rights as an accurate conscience, what’s to prevent an individual from acting on a fanatical conscience?  Bayle says in the Dictionary that it is the fanatics – the people who would benefit the most from the doctrine of the erring conscience – who support the principle that acting against one’s conscience can be a good.  Bayle thus conceives of fanatics as the sort of people who are willing to subvert morality, and even the rights of their own conscience, in order to undercut the rights of others.  True fanatics, however, often do not recognize that they are doing so, since they are typically convinced that they are the only people who perceive truth for what it really is.  If a fanatic is convinced of his correctness – that is, that the lights of his conscience are accurate – then he will apply to himself whatever is said in favor of truth against those whom he perceives to be in error.  The fanatic shifts the burden of falsity to those with whom he disagrees as a way to discharge doubt or discomfort, while simultaneously creating a double standard: an act is permissible when I do it, but not when others do.  What fanatics fail to grasp when they argue for the rights of truth (presumably in order to justify the persecution of those whom they believe to be in error) is that if the roles were reversed – if the fanatics were in the minority – they would no doubt be arguing in favor of religious toleration.

This leads to Bayle’s second argument for religious toleration, which is based on the principle of the natural light, articulated in the Philosophical Commentary, forbidding the commission of crimes.  Bayle’s moral principle against committing crimes supports his defense of the doctrine of the erring conscience: if the accurate conscience did indeed have the right to coerce, it would only be a right considered from an abstract point of view.  According to Bayle, the abstract point of view is not that of conscience; conscience provides direction for the particular beliefs and actions of a particular person.  Setting aside the abstract point of view, the only way to justify coercion is by appeal to the conscience itself, whose accuracy is exactly the issue at hand.  Since the only justification available to conscience is the force of its persuasion, then if the true religion were ordered by God to persecute heretics, heretics would also have the right to persecute the true religion.  This scene of rampant persecution is the epitome of moral breakdown, and Bayle thinks that no such situation can be justified by an appeal to Scripture – or to conscience.  Religious coercion is not only morally villainous, but it also violates the very heart of all religions – and most importantly for Bayle’s readers, it violates the heart of Christianity.

Bayle’s principle of the natural light – that no reading of Scripture can be true that justifies the commission of moral crimes – thus adds moral disapprobation to any conscience-based sanction against coercion.  It also provides a principle upon which those of differing consciences can agree.  The revelation of the natural light that Bayle cites here – that committing crimes is always immoral no matter what the justification – comes from the faculty of right reason.  Bayle argues in Diverse Thoughts that this faculty of reason, responsible for intuiting certain basic rational moral maxims, is equally accessible to both atheists and believers – whether heretical or orthodox.  This implies that everyone is subject to these same moral maxims, including the absolute prohibition on using conscience as a motive to justify committing crimes.  (Note, however, that this principle of the natural light only governs action – that is, it prohibits committing crimes, which is the realm of action.  It gives no clear doxastic guidance outside of these basic moral maxims.)

This principle of natural light thus separates religious beliefs, where Bayle is rather permissive, from basic moral beliefs, where only right reason has sway.  There are two major benefits to this separation.  First, it allows Bayle to maintain that all individuals of every confession – or no confession – are subject to the same basic moral maxims, which apply equally to everyone with access to the “natural light” of right reason.  Second, it allows Bayle to maintain that we may still have good reason to condemn beliefs of those with an erring conscience, but that rather than condemning those who believe erroneously, we should condemn those who profess to be in good faith but are not – a sin not merely of belief, but of action.  Bayle specifically tackles this issue in his Dictionary article on Arius.  The group for whom Bayle reserves his strongest condemnation in that article is not heretical teachers who act in good faith, instructing people in a simple way in accord with their conscientious beliefs.  Rather, his strongest words are for heretical teachers who teach heresy without believing it; he calls them “monsters of ambition and malice.”  Presumably, the force of Bayle’s condemnation rests not on the heresy of such teachers, but on their hypocrisy – the discrepancy between belief and action.

Interestingly, for all of Bayle’s emphasis on right action over right belief, he still leaves room for a distinction between valuable and worthless beliefs.  Bayle’s insistence on the primacy of right praxis over right doxa does not imply that all opinions are equally good.  This is consistent with Bayle’s position that there is good reason to condemn false religious beliefs and to maintain orthodox beliefs.  What is most distinctive about Bayle, however, is his redefinition of the essence of religion: what is most important is not right belief, but right action.  Right action requires right reason, and right reason requires religious toleration.

6. References and Further Reading

a. Primary Sources

  • Bayle, Pierre.  Correspondance de Pierre Bayle. Eds. Elisabeth Labrousse & Antony McKenna.  12 vols.  Oxford, 1999-2015.
    • A monumental assembly of Bayle’s correspondence from February 1662 onward.  Projected to extend to 20 volumes.
  • Bayle, Pierre.  Dictionnaire historique et critique, par M. Pierre Bayle.  Amsterdam, Leyde, La Haye; 1740.  5th Edition, 4 vols. in-folio.
    • The work for which Bayle is most famous.  The fifth edition of 1740 is the easiest to access online, at the University of Chicago’s ARTFL project (https://artfl-project.uchicago.edu/content/dictionnaire-de-bayle), but the definitive version is the second edition of 1702, which is the first to include the “Clarifications” as appendices.
  • Bayle, Pierre. Historical and Critical Dictionary, selections. Trans. & ed. by Richard Popkin.  Indianapolis: Bobbs-Merrill, 1965.
    • The standard contemporary edition of Bayle’s Dictionary in English, though unfortunately it includes only a small fraction of the original.
  • Bayle, Pierre.  Œuvres diverses de M. Pierre Bayle, professeur en philosophie et en histoire à Rotterdam.  La Haye/The Hague, 1727-31; Hildesheim, 1964-68.  4 vols, in-folio; Vols V.1 & V.2: Hildesheim: G. Olms, 1982-1990.
    • The standard edition of Bayle’s corpus (not including the Dictionary); it includes all of Bayle’s published works, as well as some fragments of correspondence.
  • Bayle, Pierre.  A Philosophical Commentary on These Words of the Gospel.  Eds. J. Kilcullen & C. Kukathas.  Indianapolis: Liberty Fund, 2005.
    • The standard contemporary edition of Bayle’s Philosophical Commentary in English.  The translation is an amended version of the first English translation in 1708.

b. Secondary Sources

  • Bost, Hubert.  Pierre Bayle.  Paris: Fayard, 2006.
    • The definitive contemporary biography of Bayle.  In French.
  • Brush, Craig B.  Montaigne and Bayle: Variations on the Theme of Scepticism.  The Hague: Nijhoff, 1966.
    • An early and thorough treatment of Bayle’s skepticism.
  • Delvolvé, Jean.  Religion, critique, et philosophie positive chez Pierre Bayle.  Paris: Alcan, 1906.
    • The beginning of twentieth-century scholarship on Bayle, defending a fundamentally proto-Enlightenment reading of Bayle.  In French.
  • Hickson, Michael W.  “Theodicy and Toleration in Bayle’s Dictionary.”  Journal of the History of Philosophy 51.1 (2013): 49-73.
    • A rigorously argued, meticulously detailed treatment of the relationship between Bayle’s position on theodicy and his defense of religious toleration.
  • Irwin, Kristen.  “Which ‘Reason’?  Bayle on the Intractability of Evil,” in New Essays on Leibniz’s Theodicy, eds. Larry Jorgensen & Samuel Newlands (Oxford University Press, 2014), 43-54.
    • A contextually sensitive account of Bayle’s position on theodicy.  It argues that Bayle’s final position on theodicy contains the resources to reply to Leibniz’s objections.
  • Irwin, Kristen.  “Bayle on the (Ir)rationality of Religious Belief,” Philosophy Compass 8:6 (2013), 560-569.
    • An exposition of Bayle’s treatments of the rationality of religious belief.
  • Kilcullen, John.  Sincerity and Truth: Essays on Arnauld, Bayle, and Toleration.  Oxford: Clarendon Press, 1988.
    • A masterful treatment of Bayle’s arguments defending religious toleration.
  • Labrousse, Elisabeth.  Pierre Bayle: Hétérodoxie et rigorisme.  Paris: Albin Michel, 1996. 2nd ed.
    • An especially thorough treatment of Bayle’s thought by the premier Bayle scholar of the twentieth century.  In French.
  • Lennon, Thomas. Reading Bayle.  Toronto: University of Toronto Press, 1999.
    • The definitive treatment of Bayle’s thought in English.  It argues that Bayle’s thought is deeply and irreducibly anti-systematic in nature.
  • Lennon, Thomas.  “What Kind of a Skeptic Was Bayle?” Midwest Studies in Philosophy XXVI (2002), 258-279.
    • An exceptionally clear taxonomy of the various senses in which Bayle has been thought to be a skeptic.
  • Maia Neto, Jose R.  “Bayle’s Academic Skepticism,” in Everything Connects: In Conference with R.H. Popkin, eds. J.E. Force and D.S. Katz.  Leiden: Brill, 1999; 264-275.
    • A compelling argument that Bayle’s skepticism is not Pyrrhonian, but fundamentally fallibilist and concerned above all with intellectual integrity.
  • Mori, Gianluca.  Bayle philosophe.  Paris: Honoré Champion, 1999.
    • The most contemporary treatment of Bayle as an ironic critic of religion, and as a moral thinker focused on “common notions”.  In French.
  • Popkin, Richard.  The History of Scepticism from Savonarola to Bayle.  New York: Oxford University Press, 2003.
    • The definitive history of fifteenth, sixteenth, and seventeenth-century skepticism in Europe.
  • Sandberg, Karl C.  At the Crossroads of Faith and Reason: An Essay on Pierre Bayle.  Tucson: University of Arizona Press, 1966.
    • A short, clear primer on the themes of faith and reason in the Baylean corpus.

Author Information

Kristen Irwin
Email: kirwin@luc.edu
Loyola University Chicago
U. S. A.

The Aesthetics of Classical Music

Musical aesthetics as a whole seeks to understand the perceived properties of music, in particular those properties that lead to experiences of musical value for the listener. It may also be understood more broadly as essentially synonymous with the philosophy of music, thus including issues of musical ontology, epistemology, ethics, and sociology. A specific area of focus within musical aesthetics is the aesthetics of classical music; it addresses questions relating to the aesthetic properties and aesthetic value of music in the Western classical tradition.

What aesthetic content does classical music have to offer? Does it consist simply in pleasing patterns, which have no meaning outside of the musical structures themselves? Can it express emotion, feeling, or other kinds of inner states? Does classical music offer insights into life not available through other art forms? Can it possess identifiable meanings, or significant conceptual, historical, or symbolic content? If so, how could this be achieved, given that its materials appear to be non-signifying in nature? These are some of the principal questions that concern the aesthetics of classical music.

After discussion of several important issues relating to classical music as an art form and an overview of influential discussions of the topic prior to the 20th century, this article addresses these principal questions through a discussion of four major topic areas in the aesthetics of classical music: musical understanding, musical form, emotion and expressiveness, and some further types of aesthetic content in classical music.

Table of Contents

  1. Classical Music as an Art Form
    1. Music and Inner Experience
    2. The Temporal Aspect of Music
    3. Classical Music as an Historical Tradition
    4. Musical Works and Musical Performances
  2. Historical Discussions
    1. Kant
    2. Schopenhauer
    3. Hanslick
    4. Gurney
  3. Understanding
    1. The Listening Experience
    2. Theories of Musical Meaning
    3. Theories of Musical Symbolism
    4. Theories of Musical Syntax and the Influence of the Cognitive Sciences
  4. Form
    1. Music as an Abstract Art
    2. Musical Formalism
    3. Beauty, the Sublime, and Sensuous Pleasure
  5. Emotion
    1. Association and Arousal Theories
    2. Resemblance Theories
    3. The Role of Imagination and Metaphor
    4. The Expression of Negative Emotions
  6. Human Experience and Values
    1. Dilthey and Music as the Expression of Lived Experience
    2. Sartre, Adorno, and Music as a Social Force
    3. Contemporary Theories
  7. References and Further Reading

1. Classical Music as an Art Form

In the case of music, as in other arts, the term ‘classical’ indicates the presence of an established or long-standing tradition. While the roots of classical music extend back to Gregorian chant, three developments occurring in the 11th century are often regarded as marking the beginning of the classical tradition in western music. These are the developments of polyphony, the principles of order, and the establishment of musical pieces as compositions. The classical tradition is centrally defined by European art music composed during the Common Practice period, which encompasses Baroque, Classical, and Romantic music (roughly 1650-1900). It also includes Medieval, Ars Nova, and Renaissance art music, as well as non-European, 20th century, and contemporary art music that incorporates compositional practices that are recognized as being well-established in western art music. While the vast majority of compositions in Western art music unambiguously fall under the category of ‘classical music’, one can argue that, though there will be no decisive line, certain highly experimental or innovative pieces cannot be a part of an established tradition of composition and thus should not be considered ‘classical’.

In contrast to the aesthetics of popular music, the aesthetics of classical music has traditionally focused on aesthetic content that is strictly musical in nature, excluding any additional content conveyed through words, actions, visual displays, or any other non-musical elements. It has typically limited itself to inquiry into the aesthetic content of musical works that is available from the music alone. Although there are clearly topics of significant interest in the additional aesthetic qualities of classical works that include non-musical elements (whether these be semantic, poetic, dramatic, or dance-related), most philosophers writing about classical music have been unwilling to venture into this territory. The focus on music as such in the aesthetics of classical music is due to the compelling philosophical questions generated by pure or ‘absolute’ music, the complexity involved in considering music in combination with non-musical elements, and a desire to understand the art of music apart from any aesthetic content contributed from other sources. In keeping with the historical focus of the aesthetics of classical music on music as such, this article restricts itself to discussion of aesthetic content that is purely musical in nature and does not address topics involving the combination of music with other aesthetic elements.

Several features of classical music as an art form play a central role in defining the areas of aesthetic inquiry that pertain to it. Three features in particular deserve attention. These are the unique impact classical music has on our inner experience, its temporal nature, and the central role played by the tradition of tonal harmony, even after its “collapse” at the beginning of the 20th century.

a. Music and Inner Experience

Classical music’s ability to engage and enliven our inner experience is a primary reason why it holds so much philosophical interest. What is it about classical music as an art form that enables it to connect so strongly with our inner life? Part of the answer would seem to lie in the fact that it is an auditory art. The perception of aesthetic content through hearing differs in fundamental ways from the perception of aesthetic content through vision, especially in the case of visual arts that make use of representation. One of the greatest differences between the two modes of artistic perception is that unless we are given rather clear guidelines, we do not interpret musical sounds as representations of objects. The preexisting ability to interpret and assign meanings to visual images does not automatically come into play when we hear musical sounds. It appears that music has the capacity to engage our aesthetic sensibility without also engaging the cognition of objects. This sensibility is linked in complex ways to inner experience, feelings, moods, and emotions.

In Western philosophy, discussion of the special power of music to shape our inner life predates Plato, as evidenced by the lively debates of the pre-Socratics on this topic. Plato himself devotes substantial attention to it in both the Republic and Laws, conceiving of music as an art that can bypass reason and penetrate into our innermost self, impacting the constitution of our character. To use Plato’s terms, music acts as a “charm” on our inner life, shaping this life to its pattern. Classical music in particular stands out among musical cultures for its ability to evoke compelling inner experiences in the listener. Curiously, the power of classical works to evoke such experiences appears to be heightened in many purely instrumental works despite the fact that such pieces possess no readily identifiable meaning.

b. The Temporal Aspect of Music

In addition to its distinctive characteristics as an art form perceived through hearing, music is, of course, always temporal. Many writers, Roger Scruton among them, suggest that music leaves our minds no choice but to move with it when we listen attentively. This activity of the mind is not merely the recognition of new sounds as they occur. The mind moves sympathetically with the motion it perceives in the music. Thus, another important aspect of classical music is that our mental perception of the movement in the music is frequently so strong that we can feel it almost as we feel physical motion.

Our minds also respond to the temporal nature of music in another way. The mind automatically follows the progress of what it hears and assimilates this content into its conception of the piece as a whole. Music’s temporal nature entails that we do not experience the whole work at once or in an order of our choosing, and that consequently the order of presentation is fundamental to our experience of the musical content. In most classical music, and perhaps all art music of the Common Practice period, we perceive purposive and goal-directed movement along with structures and relationships that develop over time, though the scope and complexity of such content varies greatly from piece to piece. As listeners we recognize that an effort has been made to produce an aesthetic value-content, whether formal, expressive, or otherwise, worthy of appreciation or understanding. Due to this recognition, the assumption of an aesthetic attitude is a common practice in listening to classical music and is thought to be an important means of enhancing our experience of the music as it unfolds.

c. Classical Music as an Historical Tradition

As an historical tradition, classical music gradually expands its artistic resources, from the practices of medieval polyphony, through the incorporation of new elements in the Renaissance, to the achievement of a conception of music and musical composition that is shared across Europe by the middle of the Baroque. The subsequent development of classical music during the Common Practice period is unique in the way that it preserves a strong continuity in compositional techniques while at the same time evolving continually as an art form. The late works from this period make use of the same basic musical materials (scales and chords) as the early ones: the diatonic scales, triadic functional harmony, primary organization around the dominant-tonic relationship, integration of vertical and horizontal dimensions, and so on. Early works differ from later ones in countless ways, but the fundamental musical materials and relationships do not change until the extended chromaticism of late romantic music begins to dissolve a sense of the tonic altogether. Later works differ from earlier ones primarily through creative innovations made by particular composers that are compatible with the existing tonal system, and through a gradual exploration and expansion of resources already implied in the tonal system itself. This gradual expansion within the context of a continuous tradition has significant implications for the expressive possibilities classical music possesses as an art form, allowing for the emergence of a repertoire of expressive compositional techniques that grows in effectiveness and scope as it progressively develops the potential that is inherent in tonal harmony.

The diverse compositional approaches developed in classical music in the early part of the 20th century introduce new questions for musical aesthetics. Many aesthetic theories based on analysis of music of the Common Practice period do not apply to compositions based on approaches that diverge from tonal harmony. This limitation applies to theories of meaning, form, and expressiveness alike. Most influential and contemporary philosophers of classical musical aesthetics focus almost exclusively on tonal classical music (including music that achieves a tonal center by means other than tonal harmony, as found in the music of Stravinsky, Debussy, and Bartok). Given that many of these theoretical perspectives do not apply to non-tonal music, the aesthetics of non-tonal classical music is an area in need of further development by the discipline.

d. Musical Works and Musical Performances

There are many philosophical questions surrounding the nature and definition of music and the ontological status of works of music. However, because these questions do not apply to classical music in particular, and because the discussion of these topics benefits greatly from comparisons between different musical genres and traditions, they are more appropriately addressed under the philosophy of music or musical aesthetics in general. As a result, this article offers just a brief summary of issues concerning the definition of music, musical ontology, and authentic performance of musical works.

General definitions of music most often focus on demarcating music from the non-musical (largely due to the musical experimentalism prominent in western art music from the 20th century onward), and on ensuring that the diversity found in the world’s musical traditions is taken into account. These definitional questions are not pertinent to the aesthetics of classical music because they focus on issues involving music outside of the classical tradition.

A similar situation exists with regard to musical ontology, though primary focus is given to works of classical music in some instances. One ontological issue pertaining centrally to classical music concerns the metaphysical nature of a work of music. Do musical works exist? If so, in what sense? With regard to musical ontology, a Platonist would hold that a work of classical music is an abstract object, while a nominalist would hold that it must be understood solely in terms of particular objects that relate to it, such as the musical score. In contrast to both of these positions, anti-realists deny that musical works have any kind of real existence at all, though some anti-realists, stopping short of discounting the question altogether, grant musical works a fictional status.

A second issue has to do with what constitutes an authentic performance of a piece. Is it sufficient to perform the right pitches and rhythms in the right order, or is more required to produce an instance of a given work? How essential to authenticity is the use of appropriate period instruments? Is a piano reduction of an orchestral score still an instance of the same work? Debate over these questions centers around which elements must be present in order for a performance to constitute an instance of the work in question. Even if a performance meets the criteria required for authenticity, there is a further question about its reception by the audience. Considering that the sensibilities of listeners continue to change, what is the significance of the fact that contemporary audiences cannot experience works as their original audiences did?

Influential discussions of musical ontology and authentic performance as they pertain to classical music include Jerrold Levinson’s Music, Art, and Metaphysics, Lydia Goehr’s The Imaginary Museum of Musical Works, and Stephen Davies’ Musical Works and Performances.

2. Historical Discussions

Although discussions of topics relevant to Western musical aesthetics date back to the pre-Socratics, it is not until the 18th century that musical aesthetics takes shape as an inquiry focused on the understanding of the perceived properties and capacities of the art of music. Starting with early theorists such as Mattheson and progressing through thinkers such as Kant and Schopenhauer to later writers such as Gurney, historical discussions of musical aesthetics in Western philosophy are in fact discussions of the aesthetics of classical music, for the simple reason that they take music of the classical tradition as their subject matter.

Many of the early discussions of classical musical aesthetics revolve around the question of what music itself is capable of presenting to the listener, with much of the debate centering on the question of how and to what extent music can convey emotional content. German and English discussions of the topic, such as those of Mattheson and Hutcheson, are typically characterized by the view that music either stimulates psychological states directly or arouses them through imitation of ways that emotion is expressed, principally by the human voice. By contrast French theorists during this early period, such as Boyé and Chabanon, oppose the idea that music is capable of expressing emotion on the grounds that it lacks the tools required for successful imitation or representation. These early writers prefigure the debates between expressionist and formalist viewpoints in later discussions of the role of emotion in the experience of classical music (see Lippman for selected excerpts from these authors and further detail on early musical aesthetics).

a. Kant

Following early explorations of the topic, the first major contributor to the aesthetics of classical music is Immanuel Kant in his Critique of Judgment. In applying his aesthetic theory to music, Kant is primarily concerned with the question of whether, or to what degree, music belongs to the beautiful or to the pleasing arts. Kant maintains that aesthetic judgments consist in feeling disinterested pleasure in perceiving the form of purposiveness in an object, apart from charm, emotion, or any definite concept of what the object ought to be. He further claims that the perception of the form of purposiveness puts the imagination and understanding into harmony such that they are able to freely play with one another. This state of free play, so far as it can be felt in sensation, is the basis of the pleasure that we feel in response to beauty.

Kant considers the possibility that the imagination can apprehend a form in the musical composition which, when compared by reflective judgment to the faculty of referring intuitions to concepts, places the imagination in harmony with the understanding. In music this form, apprehended independently of any conception of an object, is purely a pattern of melodic and harmonic intervals. Harmonious agreement between the imagination and the understanding in the perception of the form of the composition would, provided that this is possible, result in the music being perceived as purposive for reflective judgment. It would also mean that music deserves to be classified among the beautiful arts.

Initially Kant identifies music as an object of pure aesthetic judgments, classifying “all music without words” as a type of free, rather than dependent, beauty. In his more detailed discussion of music in sections 51-54 of his Critique of Judgment, however, Kant seems to vacillate between the possibility that music belongs to the beautiful arts and the possibility that it falls short of providing a formal content suitable for aesthetic judgments and thus is merely a pleasant art. This ultimately leaves the question of which category music belongs to undecided. If music can qualify as beautiful, the composition as a form alone must constitute the object of aesthetic judgment. Factors such as the instruments used to play the composition and the quality of their tone may add charm to the piece and they may even enhance our experience of its beauty, but by themselves such factors do not constitute objects of aesthetic judgment.

While Kant explores the possibility that the composition as an abstract pattern of relationships may present purposive form and thus qualify as beautiful, he appears to conclude that the apprehension of purposive form in music is difficult at best. In the absence of the apprehension of such form, music is limited to the pleasant rather than the beautiful, consisting primarily in a changing play of auditory sensations. In this case, music can produce enjoyment and emotion, but is not a subject for pure judgments of taste. Apart from his enormous influence on the field of aesthetics as a whole, Kant’s writing on music has been influential for its emphasis on purely formal properties and its concomitant rejection of the value of emotion and sensuous qualities to the listening experience. As such, it clearly lays the groundwork for more explicitly formalist approaches in the 19th century.

b. Schopenhauer

Arthur Schopenhauer in The World as Will and Representation interprets ‘will’ as the underlying metaphysical reality, as the thing-in-itself, and grants music the privileged status of presenting it. Departing from Plato and Kant, Schopenhauer denies that the underlying metaphysical reality is rational in nature. Instead, will is a blind urge whose continuous striving has no guiding purpose. Unlike the other arts, whose significance lies in the ability to capture “the permanent essential forms of the world,” thus limiting their reach to interpretations of the phenomenal realm, music expresses the will itself, directly and immediately, speaking the “universal imageless language of the heart.” Yet even though in music we experience a direct presentation of the will, that presentation, like the will itself as thing-in-itself, remains indescribable.

Despite Schopenhauer’s allegiance to Kant’s transcendental idealism, his aesthetics represents an important departure from Kant. Whereas Kant viewed the aesthetic value of music in purely formal terms as a play of patterns, Schopenhauer holds that music is valuable for its direct expression of the continuous striving of the will. Thus, the contrasting views of Kant and Schopenhauer prefigure later debates between formalists and expressivists concerning the aesthetic properties of music.

c. Hanslick

In his influential treatise On the Musically Beautiful Eduard Hanslick argues for a strong version of aesthetic formalism that limits aesthetically valuable content to the audible analogue of a moving arabesque or kaleidoscope, differing from these only in that music “manifests itself on an incomparably higher level of ideality” and “presents itself as the direct emanation of an artistically creative spirit.” Hanslick rejects the view that music is capable of expressing emotions, holding instead that music consists purely in tonal forms that develop in time. In doing so he presents an early cognitivist account of emotions, holding that emotions are primarily defined by concepts. He claims that music is incapable of conveying the conceptual content needed to differentiate between specific emotional states. As a result, the aesthetic content of music is limited to a specifically musical kind of beauty that “consists simply and solely of tones and their artistic combination.” His conclusion is that the “representation of a specific feeling or emotional state is not at all among the characteristic powers of music.”

The production of an experience of motion is the aspect of music that is shared with emotion. Through dynamics, tempo, shape, and timbre, music can present auditory instances of qualities that accompany emotions, but no actual emotional content is present, since this would require music to convey concepts: “Music can, in fact, whisper, rage, and rustle. But love and anger occur only within our hearts.” As one might expect given his allegiance to a purely formal conception of musical value, Hanslick also rejects the idea that music as such is suitable for the representation of extramusical content.

d. Gurney

In the latter part of the 19th century Edmund Gurney developed an approach to musical expression based on Darwinian evolutionary theory. Gurney, preceded by Herbert Spencer, postulated a biological origin of music in the impulse to attract and court a mate. According to Gurney, music originates from the capacity that evolved in our ancestors to use sounds to arouse responses from potential mates and rivals. Given that it evolved in this way, music is directly connected to the arousal of our passions. This original connection to the passions and to sexual excitement is fundamental to music in all of its forms. Emotion in classical music is a sublimation of this original sexual excitement. Its origins do not, however, constitute a link between music and extra-musical values or interests. Gurney argues that music offers a profound and entirely self-contained pleasure, whose origins grant it a special connection to our inner experience. Gurney’s work addresses many other fundamental questions in musical aesthetics, including the nature of musical motion, the basic components of musical understanding (which Gurney believed to be melodic forms), musical beauty, and musical value. It is also the inspiration for a recent work by Jerrold Levinson on the nature of musical understanding entitled Music in the Moment.

3. Understanding

Following Gurney’s claims for the role of melodic structures in musical understanding, scholars have generally agreed that an account of the nature of musical understanding must accompany any comprehensive treatment of the aesthetic properties of classical music. Musical understanding in this sense refers to how specific musical structures combine to convey an intelligible sense to the listener. Such an account establishes a basis upon which to make further claims about the formal or expressive content of music.

a. The Listening Experience

In contemporary discussions there is general consensus that when we experience classical music, we hear the pattern of sounds as an intentional object; that is, we hear the musical work in the form of an unfolding audible musical structure. The term ‘intentional’ in this context signifies that music exists for us in virtue of its being an object of our conscious focus. Hearing patterns of sounds as music is something we contribute as listeners, since it is perfectly possible for someone not familiar with a particular kind of music to fail to grasp its aesthetic qualities. In appreciating music we hear the sounds as musical elements relating to one another within an aesthetic framework as components of a work of art. This audible musical structure, together with any additional attendant qualities such as timbre, dynamics, and vibrato, is the object of appreciation that produces experiences of aesthetic value. In Values of Art Malcolm Budd attempts a narrower definition that limits the aesthetic content of music to the work’s audible musical structure alone, leaving out of consideration timbral and performance-related aspects. More recently, multiple authors have argued that these attendant qualities are significant aspects of the experience of aesthetic value. Regardless of these particular issues, there is a broad consensus that the experience of aesthetic value in classical music should not be considered separable from the listener’s experience of the audible musical structure of a work. It is this structure, perceived through active listening, that both contains the aesthetic content and produces the experiences of aesthetic value.

In perceiving the audible musical structure, our minds follow the succession of events and we grasp them aesthetically in relation to one another when we listen attentively. This activity of the mind is not merely the recognition of new sounds as they occur, but involves a sense of motion in the music. Given that the unfolding audible structures of classical music do not involve motion in a literal sense, the perception of motion presents a problem for the theorist.

In his pioneering treatise Sound and Symbol, Victor Zuckerkandl identifies our perception of motion as resulting from the directional tendencies present in tonal music. This includes the leading tone seeking to find resolution in or “move to” the tonic. Roger Scruton finds that while this observation is accurate, it does not capture the essence of musical motion. Scruton argues that motion must be understood as part of a system of indispensable metaphors involved in perceiving the music, and further that we perceive musical motion in spatial terms. Malcolm Budd argues that Scruton’s insistence on a spatial conception of musical movement is unnecessary and that a better approach would be to characterize music in terms of a purely temporal Gestalt, limiting music to movement in time and eliminating the need for metaphor. Scruton’s reply is that the concept of merely temporal movement is itself metaphorical in nature and that foundational metaphors such as spatial movement are also indispensable because they connect music to human experience. This allows, he claims, for the development of a complete account of music’s meaning and value.

Another topic of debate concerns the extent to which the perception of larger scale structures plays a role in musical understanding and appreciation. In agreement with the emphasis placed on the value of larger scale formal structures by Heinrich Schenker and Leonard Meyer, Peter Kivy emphasizes the architectonic aspects of the listening experience. He argues that large scale structural patterns and relationships constitute an important aspect of the expressiveness and aesthetic value experienced by the educated listener. In The Aesthetics of Music Roger Scruton also advocates for the importance of these aspects. He finds the comparison between the methods of music composition and architecture to be an apt one, though he rejects the Schenkerian claim that the surface structure of a piece is generated by its underlying large scale plan. In Scruton’s view musical understanding consists in perception of the composer’s development of the fundamental linear and vertical relationships present in tonal music, which he describes as an “order of polyphonic elaboration” that is inherent in the practices of triadic harmony. Inspired by Gurney, Jerrold Levinson disagrees, arguing instead for ‘concatenationism’, the view that basic musical understanding, together with the greater part of music’s aesthetic value, does not require perception of large scale formal relationships and that “the core experience of a piece of music is a matter of how it seems at each point.”

Related to the question of the value of perceiving larger scale formal patterns in classical music is the question of whether formal training or a certain level of education is required for the appreciation of classical music. Though scholars agree that a certain amount of acculturation is required for its understanding and appreciation, there is debate concerning the extent to which education and musical training can enhance the listener’s ability to perceive the aesthetic content of the music. Those such as Kivy who locate the primary aesthetic content of classical music in the musical form and the purely musical relationships that exist within it tend to argue that a higher level of education or acculturation is needed. On the other hand, others such as Levinson locate the primary aesthetic content in expressive qualities and in the way the music unfolds from moment to moment. They vary in their assessment of the aptitude required of the listener depending on their conception of what musical expression consists in and how it occurs.

b. Theories of Musical Meaning

Recognizing that we hear a pattern of sounds as an intentional object helps explain how we come to perceive the sounds produced as a form of art. However, it does not address the question of how an unfolding musical structure produces meaningful aesthetic content. An account of musical understanding requires an explanation of how the patterns and relationships present in the musical structure produce meaning for the listener.

In The Language of Music, Deryck Cooke seeks to show that certain recurrent patterns present in the music of the Common Practice period have specific emotional meanings, making it possible to construct a basic emotional vocabulary of classical music that is composed according to the principles of Common Practice tonality. Cooke further extends his analytical approach to defining emotional content contextually. If correct, his insights would establish a basis for understanding the emotional content of most classical music. Malcolm Budd and Roger Scruton have objected to Cooke’s theory on multiple grounds. They argue that it is inappropriate to construe music as a language because music lacks both a syntactic and a semantic structure, and that even if the claim to be a language is taken in a metaphorical sense, the reappearance of similar musical patterns in similar expressive contexts is not a matter of meaning, but of conventionally tested appropriateness to the context in question. Another important objection focuses on Cooke’s claim that composers use music’s vocabulary of emotions to convey the emotions that they felt when composing the work, sometimes labelled the ‘expression-transmission theory’ or simply the ‘expression theory’ of musical expression. Budd points out that by locating the value of the experience in reception of the composer’s emotions, the expression-transmission theory removes the aesthetic value from the work itself, conceiving of music as a tool for arousing the emotions of the composer in the listener. In reality, he argues, we experience aesthetic value primarily in the experience of listening to the music itself. It would misrepresent our motivation for listening to say that experiencing the emotions that the composer experienced could be a substitute for the experience of the specific aesthetic qualities found in a musical piece.

Following Cooke, a comprehensive and detailed attempt to understand how tones and rhythms produce an experience of meaningful content was made by Leonard Meyer in Emotion and Meaning in Music. Meyer, whose basic approach was further developed by Eugene Narmour, makes use of information theory in developing the thesis that a great deal of what we appreciate in classical music is the result of a sense of expectation produced by antecedent-consequent relationships. According to Meyer, a sequence of tones has musical meaning if it points to or sets up the expectation of other tones that will follow. Meyer calls this type of meaning ‘embodied meaning’, as distinguished from ‘designative meaning’, which consists in a culturally established reference to some extramusical content. Largely due to his reliance on information theory, Meyer defines embodied meaning purely in terms of expectation. It is generated by directionality inherent in the diatonic scale (leading tone-tonic relationships in melodies and harmonies) as well as by expectation that is built on the listener’s familiarity with traditional forms. One of the most important instances of expectation is the perception of an incomplete pattern, leading to a desire for its fulfillment on the part of the listener.

Finding Meyer’s concept of embodied meaning to be too one-dimensional and seeking to restrict musical meaning to the audible musical structure itself (that is, to the exclusion of what Meyer described as designative meanings), Budd offers the concept of ‘intramusical meaning’. This concept, Budd suggests, consists in the ensemble of musical features and relations present in an audible structure as perceived by an educated listener. In developing the concept of intramusical meaning, Budd seeks to emphasize the abstractness of music as an art form. He wishes to establish that perceiving the audible structure of a work and the relationships that this structure contains, its intramusical meaning, is a necessary precondition for any further interpretation of a musical work. As Budd conceives of it, intramusical meaning is the most basic and fundamental characteristic of a musical piece. Budd points out that Meyer’s concept of embodied meaning clearly does not account for the diversity of feelings generated in our experience of music of the Common Practice period. Intramusical meaning encompasses all significant relationships perceived by the listener, so it does not restrict musical meaning to a specific process, such as that of antecedent-consequent relationships. At the same time, Budd acknowledges that Meyer’s concept of embodied meaning does account for the production of responses such as anticipation, frustration, confusion, surprise, and satisfaction, with varying degrees of intensity. A potential criticism of Budd’s concept of intramusical meaning is that it places all musical meaning under a single all-encompassing category and gives no account of how specific types of structures or relationships lead to specific musical meanings.

c. Theories of Musical Symbolism

Inspired by Ernst Cassirer’s Philosophy of Symbolic Forms, Susanne Langer in Philosophy in a New Key interprets musical understanding to consist in grasping a symbolic content rather than in the perception of discrete intramusical meanings. Langer offers a theory of musical works as “unconsummated presentational symbols.” As such, each piece of music symbolizes the form, but not the content, of a feeling. Unlike words, presentational symbols are understood only through seeking to grasp the whole, the elements of which must be interpreted in relation to each other. Pictures are presentational symbols, as are works of music. The main function of musical compositions is to symbolize feelings. Music is an unconsummated presentational symbol because it only reflects the morphology of feeling, not the content of specific feelings. If true, Langer’s theory entails that we can understand a given work as a formal abstraction of an emotional experience. In evaluating Langer’s views, Roger Scruton argues that because Langer’s unconsummated symbols do not have a specific meaning, reflecting instead only the morphology of feelings, her theory reduces to the claim that musical processes have a formal resemblance to emotional processes.

Another significant attempt to speak of music in symbolic terms was made by Nelson Goodman and given further musical focus by Monroe Beardsley. Arguing that works of art symbolize through predication rather than denotation, Goodman develops the concept of ‘exemplification’ to explain artistic expression. An instance of exemplification is one in which a predicate attaches to something which also refers to the predicate, as in a swatch of cloth from a tailor, which “exemplifies only those properties that it both has and refers to.” The difference between everyday instances of exemplification and exemplification in art is that in art the referential component is metaphorical rather than literal in nature. In applying Goodman’s concept of exemplification to music, Beardsley offers the example of a sonata whose first movement has a diffident, indecisive character. Given that it is displayed by the sonata and also plays a significant role in the piece as a whole, diffidence is an instance of musical exemplification.

d. Theories of Musical Syntax and the Influence of the Cognitive Sciences

Several notable authors have sought to offer an account of musical meaning by analyzing music in terms of a musical syntax. Influenced by the structuralism of Ferdinand de Saussure, Nicolas Ruwet and Jean-Jacques Nattiez argue that music does possess a syntax and therefore can be interpreted and understood similarly to any other system of signs. A prominent criticism of this approach argues that any such attempt will necessarily be unsuccessful because, unlike the case of natural language, it does not appear to be possible to define musical structures in terms of a generative grammar. Fred Lerdahl and Ray Jackendoff seek to address precisely this issue in A Generative Theory of Tonal Music. A key issue here is whether it is possible to establish a relationship between deep structure and surface structure in music by providing transformation rules for the generation of surface structures from deep structures. In seeking to establish that music possesses a generative syntax, Lerdahl and Jackendoff put forward the ‘reduction hypothesis,’ which they draw from cognitive science. This hypothesis states that we as listeners seek to organize all musical events within a piece into a “single coherent structure, such that they are heard in a hierarchy of relative importance.” Though the attempt to identify syntactic structures in music has been influential, most contemporary theorists would deny that music possesses a syntax in any robust sense.

The emphasis placed by Lerdahl and Jackendoff on how music is organized by our brains while listening shifts the focus from meaning in the music to the cognitive processes by which we understand it (though of course the two are related and both need to be accounted for). This shift makes salient the importance of supplementing philosophical investigations of musical understanding and experience with scientific approaches. Although this entry does not consider specific scientific investigations into musical cognition, it is important to acknowledge the work in areas related to understanding and experiencing music that is being done in the cognitive sciences of psychology and neuroscience. Since musical understanding and experience necessarily relate to cognitive structures and processes, approaches undertaken within various subdisciplines of psychology and neuroscience offer increasingly illuminating investigations into the topics of musical meaning and musical understanding.

In assessing the potential contribution of these fields, Tom Cochrane argues that studies in psychology and neuroscience can provide additional support for one theory of our experience of music over another, as well as in some cases allow us to reframe and synthesize traditionally distinct positions. He also acknowledges the limitations of many scientific studies, which, he suggests, points to the value of interdisciplinary collaboration between philosophy and cognitive sciences including psychology and neuroscience. A further consideration in support of scientific investigations of musical experience is the fact that philosophical authors commonly appeal to their own personal experience of music as a partial justification for their views; scientific research into musical cognition may therefore provide additional support for an otherwise highly subjective component of philosophical theories.

4. Form

Accounts of understanding classical music address the question of how patterns of sound generate meaning for the listener. As such, they have to do with the unfolding of these patterns in time during the listening experience and with the listener’s perception of relationships between musical ideas in the piece. Insofar as they focus on the process of understanding, they only partially address the more general question of what kind or kinds of aesthetic content a musical structure is capable of conveying. Is the aesthetic content of classical music limited to appreciation of patterns and relationships present in the formal structure, or does the musical form relate in some significant way to our experience outside of music? Is the aesthetic experience of this music primarily or wholly intellectual in nature, as the cognitivist would claim, or does the listener experience the content in emotional terms through the music’s expressive qualities? The fact that music unaided by words is generally agreed to possess meaning of some sort, yet does not appear to possess adequate tools for either representation or signification, makes answering these questions especially challenging.

The question of whether music means or expresses anything beyond itself is present in musical aesthetics from the time of the earliest discussions of the topic in the first half of the 18th century. Kant makes the formalist idea of limiting content to form prominent by virtue of his conception of aesthetic beauty as purposiveness without a purpose, or as the form of purposiveness. Hanslick further develops this train of thought in claiming that the aesthetic content of classical music is best understood through the analogy of a moving arabesque. Meyer emphasizes the fundamental importance of formal structures, though he acknowledges extramusical content as a legitimate aspect of some music. Influential contemporary accounts of the aesthetic value and content of the formal structure as such have been offered by Malcolm Budd, Peter Kivy, and Nick Zangwill. Underlying each of these accounts is the formalist intuition that the aesthetically significant qualities of music as an art form result from appreciation of aspects of the musical structure itself as a structure and that music, as such, has no meaning beyond the patterns and relationships present in it. While Budd ultimately appears to reserve judgment about the possibility that music could possess emotionally expressive or extramusical content in addition to the purely musical content that he advocates for, Kivy and Zangwill take a stronger stance, arguing that aesthetically significant content in music is strictly musical in nature.

a. Music as an Abstract Art

In Values of Art, Malcolm Budd characterizes music as the “art of uninterpreted sounds,” arguing that music is essentially an abstract art and that the essence of music is the audible musical structure perceived by the listener. Budd does not deny that music can contain other elements and serve other purposes, as when a musical instrument, passage, or motif is used to signify something extramusical, when a musical work in some fashion represents extramusical things or events, or when music is combined with other art forms. His claim is that such elements in music are not proper to the art; that they are not part of music as such. For Budd, the musical content in music is present in an abstract audible structure whose meaning is not determined by meanings in or references to the external world. In this way, music represents nothing, makes no reference to anything, and is not about anything other than itself. Budd restricts what is essential to understanding music to the perception of the audible structural patterns present in a piece and their musically significant relations with one another. All other content is excluded.

Budd calls this form the ‘musical structure’ of the piece. For Budd, music is abstract in the sense that it does not depend for its success as an art form upon a referential relation to other areas of our experience or knowledge, whether this reference be by means of representation, imitation, signification, or some other technique that referentially links musical sounds to things in the outside world or our experience. It is important to note that Budd, in keeping with the majority of those writing in this area, places emphasis purely on musical content; referential meanings are not given serious consideration as aesthetically significant to music as an art form. Music may possess a variety of referential meanings, from the imitation of extramusical sounds, to culturally established meanings attached either to specific types of sounds or melodies, to imitations of content supplied by a program or accompanying words. Most writers would argue, however, that such referential meanings are not proper to the aesthetic content of classical music, given that they rely for their specification on extramusical elements such as words and cultural conventions. For Budd, the musical structure alone constitutes all of the musically significant content of the music. Other elements may be added for artistic enhancement. Examples of structural elements as Budd conceives of them would include melody, rhythm, and harmony, as well as other aspects of the music judged by the listener to be musically significant, such as clearly identifiable formal patterns, relations between parts (including contrapuntal motion, imitation, etc.), harmonic texture (polyphonic, homophonic, heterophonic, etc.), variations in the number of parts and in performing forces, and the like. Audible aspects of the music, including the type and quality of instrument, the quality of the performer’s technique, and the artistic choices that the performer makes, are secondary to what is contained in the music apart from these factors.

In defining music as the art of uninterpreted sounds, Budd locates the strictly musical content of music first and foremost in the listener’s perception of relationships between musical structures. Hearing the music in a work consists in perceiving the relatedness of structural features. Music is an unfolding of patterns and relationships in time, and hearing music as such is primarily a dynamic experience, that is, an experience of the flow of energies generated by the temporal unfolding of pitch relationships and rhythmic patterns.

b. Musical Formalism

The claim that music is fundamentally an abstract art may be taken to mean that music contains nothing other than sounds and their relations to one another. In other words, it may be taken to mean that music possesses only formal content such that any content other than this formal content is of secondary importance and an optional addition on the part of the hearer, and hence, not part of music itself. An account of this sort would allow that musical forms can possess emotional content as an expressive property grasped through intellectual perception and that musical forms can produce an affective state in the listener in response to aesthetically significant qualities such as beauty or impressiveness (as with Gurney). However, it would deny that music expresses emotions in any normal sense of the term. Musical formalism holds instead that all aesthetic content in music is purely musical in nature. For this reason, it also denies that music is capable of conveying human experience or values, as well as any kind of broader conceptual content relating to human life.

Peter Kivy, a prominent advocate of this approach, argues that in essence music is “a quasi-syntactical structure” that is understandable solely in musical terms having “no semantic or representational content, no meaning, making reference to nothing beyond itself.” He offers a sustained argument for this viewpoint in Music Alone and develops his discussion further in New Essays on Musical Understanding. It should be noted that in advocating what he describes as ‘musical purism’, Kivy does acknowledge that music can possess some expressive features, provided that these features are non-representational, non-referential, and possess no meaning other than a purely musical one. Kivy suggests that while music neither expresses emotions nor arouses them in us, it can possess expressive properties through resemblance, much in the same way, to use Kivy’s example, we recognize sadness in the face of a St. Bernard.

A centerpiece of Kivy’s argument is his ‘contour theory’ of musical expressiveness, first articulated in The Corded Shell. Kivy argues that the experience of expressive content in music consists, not in the emotional experience of such content, but instead in the recognition of emotional qualities through a similarity between musical shape and the characteristic shape of utterances or bodily gestures. We make this association, according to Kivy, because we are psychologically determined to animate what we perceive and interpret it in human terms. The perception of emotion in music is thus public and objective in the same way it is in people.

Kivy identifies some instances of expressive content that cannot be explained by his contour model, such as our experience of the respective qualities of the major and minor modes. He argues that these instances, whatever their origin, are established by convention and hence have the same objective character as those resembling human behavioral expressions of emotion. While acknowledging the strength of Kivy’s perspective, Mark DeBellis suggests that an appeal to resemblance via contour lacks explanatory power, since to say that we perceive both music and speech or gestures as having the same expressive quality is merely to restate the problem of expressive character. DeBellis also points to the possibility of music resembling human actions that cause emotion rather than resembling the expression of the emotion itself, as in satisfaction resulting from the perception of struggle followed by resolution. He questions whether Kivy’s claims about the conventional nature of the major and minor modes can be verified. More recently Kivy has modified his position to one of “enhanced formalism,” holding that pure instrumental music is a “black box” regarding the question of how it comes to possess expressive properties and suggesting that the important question is instead that of understanding the role that these properties play in the formal structure.

Following a similar conception of music’s aesthetic content to that of Kivy, and in agreement with Scruton concerning the metaphorical nature of our descriptions of musical qualities, Nick Zangwill argues for the ‘aesthetic metaphor thesis’. This thesis holds that, except in exceptional cases, emotion descriptions of music are metaphorical descriptions of music’s aesthetic properties. Thus, just as we say without controversy that a passage is delicate, in the same metaphorical manner we can also describe a musical passage as serene. Zangwill acknowledges that we do have intensely valuable aesthetic responses to some works of music, but denies that these responses are emotional in nature. The mistake, according to Zangwill, is to take our metaphorical descriptions literally and confuse the feelings involved in experiencing music with emotions. In agreement with Kivy, Zangwill holds that absolute music cannot evoke ‘garden variety’ emotions and argues instead that in listening to music, we experience specifically aesthetic feelings which share some, but not all of the features found in actual emotional experiences.

c. Beauty, the Sublime, and Sensuous Pleasure

Regardless of the stance taken on whether or not music is capable of expressing emotions or other types of extramusical content, there is universal agreement among theorists that classical music offers unique and highly valuable experiences of musical beauty. Historically, the predominant tendency has been to limit musical beauty to the perception of relationships existing in the formal structure of the work, excluding its sensuous qualities. The most common type of musical beauty attributed to classical music is found in melody. The great majority of individually identifiable melodies that we describe as beautiful possess certain characteristics that are easily recognizable. These include a predominantly conjunct motion, graceful contours, elegance of design, a duration such that the whole can be grasped in the listener’s immediate awareness, a sense of arrival or return toward the end of the melody, a moderate to slow tempo, and a song-like quality in the production of the sound and phrasing (such as bel canto style, for example). The details of style evolve over time, but these general characteristics hold for beautiful melodies throughout the Common Practice period and beyond, as well as for instances of melodic beauty that predate Common Practice tonality. Musical beauty in the sense of patterns pleasing to the intellect and imagination may also be found in the perception of larger scale musical forms. Assessments of the significance of these vary depending on the weight granted to architectonic features in the musical experience. At the very least, certain readily perceivable formal structures such as those present in canons and harmonic ostinatos can be included uncontroversially among standard aspects of musical beauty in classical music. Well-crafted ‘counterpoint’ is a third commonly identified type of musical beauty. At slower tempos, and especially in lower registers, counterpoint is also acknowledged by many theorists to contribute to perceptions of musical profundity.

Closely related to musical profundity is the experience of the sublime. In classical musical aesthetics, as with the other arts, the sublime is usually taken to refer to the evocation of that which is beyond human comprehension. In keeping with Edmund Burke’s influential analysis, the experience of sublimity in classical music is most often associated with feelings such as awe, astonishment, obscurity, and terror. Musical passages have been considered to evoke the sublime through several kinds of qualities: complexity, whether of overall design or of the interaction between musical elements; emotional expression and mood, which may involve intense conflict or turbulence but may also be present as transcendence or otherworldliness; and creative power, whether conveyed by an impression of the composer’s creative power in the scope or impressiveness of the work or by qualities evoking creativity in the work itself (as in a fantasia).

In contrast to the traditional focus on formal qualities, classical musicians themselves, as well as contemporary listeners to classical music, would almost universally include sensuous qualities as important contributors to musical beauty and sublimity. Indeed, a primary goal for the classical musician is to develop beauty of tone. Additionally, timbres and coloristic effects play an increasingly important role in classical compositions starting in the latter part of the 19th century, as seen in musical impressionism and minimalism, as well as in the expanded palette available through the use of greater and more varied performing forces from the Romantic period onward. For these reasons it seems difficult to deny that tone quality and the listener’s experience of both successive and simultaneous combinations of timbres should be possible objects of musical beauty and contributors to the experience of musical sublimity. In the case of sublimity, dynamics and texture would also seem to have an important role, as would, in some instances, articulation and attack. A further question is the extent to which virtuosic elements and displays of musical virtuosity by soloists constitute or enhance beauty or sublimity in music. A common analogy notes that such displays are the auditory equivalent of fireworks.

5. Emotion

Can music possess expressive content in a more substantial way than in the intellectual recognition of resemblances to human expressive behavior in purely structural qualities, as the cognitivist would suggest? Theories addressing this question can be classed into several categories, as follows. Transmission-expression theories such as Deryck Cooke’s claim that the emotions experienced in the music are those experienced by the composer. Arousal theories claim that the music’s expressiveness consists in its ability to move the listener to have an affective response. Resemblance theories claim that musical expressiveness lies in perception of a similarity between the way the music sounds and the way emotions feel. Mirroring response theories claim that expressiveness lies in the music itself rather than originating in the composer or being located in the listener; these theories nonetheless hold that listeners often mirror the emotional qualities that the music expresses, though their doing so is not required for the music to be considered expressive. Imaginative response theories claim that we experience music as expressive by imagining that the emotions we perceive in it belong to an indeterminate persona (since the music itself cannot be the possessor of emotions). Accordingly, to hear emotion in music is to hear it as the expression of feelings by an imagined individual. A related approach emphasizes the metaphorical nature of expression without attributing it to an imagined persona. Sympathy theories emphasize our sympathetic engagement in the music and a corresponding enhanced recognition of its qualities.

Although the literature is less extensive, theorists have also examined the presence and role of moods in classical music. ‘Mood’ here refers to the feeling of a state or states that persist over a significant period of time and have the capacity to color our attitude toward all of the musical content that we hear while they are being felt. It is generally assumed that moods differ from emotions not only in that they apply globally, but also in their lack of an intentional object. Although it is difficult to claim that moods contain much expressive content themselves, they may set the stage for the experience of more specific kinds of expressive content. Thus, a joyous mood might set the stage for feelings of triumphant arrival, a somber one for mourning and loss. Noel Carroll proposes that moods in music can offer a solution to the debate between formalists and arousalists, conceding to the formalist that music lacks the tools to represent the kinds of objects emotions require while granting to the arousalist the point that music can arouse “affective states that are objectless, global, [and] diffuse.” Peter Kivy disagrees, claiming that while there are certainly experiential differences between moods and emotions, they are identical in regard to how music can be expressive of them.

a. Association and Arousal Theories

Leonard Meyer combines his account of musical meaning with a theory of affective arousal. Building on the theory of emotions developed by John Dewey (whose aesthetics offers illuminating applications to classical music even though it does not consider classical music specifically), Meyer claims that emotion is evoked “when a tendency to respond is inhibited.” This situation occurs in classical music in innumerable instances when composers establish expectations, then delay the satisfaction of these expectations, as in delayed arrival on the tonic, or failure to complete a pattern that has been initiated. These examples, and countless others like them found throughout the fabric of classical compositions, trigger an affective response by establishing an expectation of fulfillment, then inhibiting that expectation. Meyer claims that this affective response can be either undifferentiated, in which case only a “feeling tone” is present (perhaps akin to purely musical feelings), or differentiated into a specific emotion by the listener in a process of imaginative association. Meyer’s theory is thus an arousal theory in its conception of affective response and an association theory in its account of the experience of specific emotions by the listener. However, as Malcolm Budd and numerous others have observed, in order to be aesthetically significant expressive content must be a product of properties perceived in the music itself. Consequently, expressive content cannot be the product of an association between the music and some extramusical content that defines or shapes our experience of it.

More recently, Jenefer Robinson has advanced another version of the arousal theory, arguing that music has the ability to excite physiological arousal directly in the listener. According to Robinson, the listener attaches an emotional label to the state of arousal after this arousal takes place. In a claim similar to Meyer’s account of emotional differentiation, Robinson holds that this label is governed by the context that the listener brings to the listening experience. Following the contributions of Robinson, many theorists now accept that arousal plays a role in the experience of classical music, even if it is only part of a more complete account. Peter Kivy figures as an exception by taking a formalist point of view, suggesting that to interpret our inner state as an emotional one after the fact is optional at best, and furthermore, is not the type of listening that appreciates what music as an art form has to offer.

b. Resemblance Theories

In his Music and the Emotions, Malcolm Budd reviews and rejects many of the prominent theories of musical expressiveness.  In his Values of Art, Budd offers an argument for a “basic and minimal concept” of what the expression of emotion in music consists in. According to Budd, the expression of emotion in music amounts to hearing the music as sounding the way an emotion feels. Thus, the core element in the emotional expressiveness of music is the listener’s perception of a likeness between what is in the music and the experience of a particular emotion. In Budd’s view, this basic “cross-categorical likeness perception” must underlie any account of the expression of emotion in music. However music is expressive of emotion, the expression of emotion must always rely at bottom upon the perception of the music as sounding like the way emotions feel. Budd goes on to identify three likely “accretions” to this “basic and minimal account,” but does not commit himself to any one view. First, the music may induce the feeling whose likeness is perceived. Second, the perception of a likeness to emotional experience may be accompanied by listeners imagining an occurrence of the perceived feeling in themselves. Third, instead of imagining experiencing the feelings that are perceived in the music, the listener may imagine that the music is an instance of these feelings rather than the feelings of any specific individual.

In The Aesthetics of Music Roger Scruton classifies Budd’s idea of a cross-categorical likeness perception, Langer’s conception of music as an unconsummated presentational symbol, and Kivy’s contour theory as versions of what he calls ‘the resemblance theory’.  Scruton argues that all versions of the resemblance theory will be unsatisfactory for two reasons. First, resemblance theories confuse expression with the means by which it is achieved (as with other arts such as poetry, music does not resemble what it expresses). Second, if resemblance involves recognizing expression without requiring that we experience something of value as a result of it (as Kivy would have it), then successful expression may occur in an aesthetically uninteresting piece of music and it is unclear why the musical presentation of expression would have any special value.

Approaching the problem of expressiveness from another angle, Stephen Davies endorses a contour model similar to Kivy’s, but also emphasizes the centrality of the listener’s response to the perceived expressive properties. Thus, experiencing expressive content involves a ‘mirroring response’ in which the listener experiences an emotion similar to that perceived in the musical structure, though the music itself is not thought to arouse this emotion directly or in a mechanical way.

In his recent Critique of Pure Music, James Young advances versions of both arousal and resemblance theories as components of his anti-formalist position. Arguing in a manner similar to Budd, but in greater detail, he claims that music arouses emotions through the resemblance the listener perceives between the experience of music and the experience of human behavior expressive of emotions. Identifying this process as the result of a ‘cross-domain mapping,’ Young follows an approach similar to that recommended by Tom Cochrane in drawing on empirical studies of listener responses as well as theories of brain function.

c. The Role of Imagination and Metaphor

Jerrold Levinson focuses on the imaginative contribution of the listener in offering an account of hearing music as drama. Heard as drama, music consists in the interplay of forces within a piece, energies or impetuses within the piece whose interaction involves qualities such as tension, suspense, assertion, struggle, and conflict. Levinson suggests that when we hear music as drama, we imagine the dramatic actions and motivations to belong to indeterminate personae or person-like agents. He acknowledges that this way of listening adds an optional layer of content not strictly derivable from the music itself.

Aaron Ridley takes an approach similar to Levinson’s with regard to the imagination of indeterminate personae, but places special emphasis on the melismatic gesture in classical music as a primary vehicle of emotional expressiveness. Ridley argues that the melismatic gesture “resembles items in the expressive repertoire of extramusical human behavior, either physical or vocal,” thus allowing the music to present states of feeling which the listener experiences through a sympathetic response to the music. Following the contributions of Levinson and Ridley, several theorists, Scruton among them, have suggested that the introduction of an imagined persona is unnecessary and that the musical entities themselves qualify as dramatic agents interacting with one another.

Much of western classical music from the Common Practice period can easily be characterized as inherently dramatic in nature, involving development, struggle, and resolution, due to its fundamental reliance on the tonic-dominant relationship. This relationship allows for multiple large and small scale instances of motivic development, of tension and resolution, departure and return, and movement and rest to occur within the context of a single piece. The tension found in the dominant seventh, as well as in other chords that function similarly, places the listener in a state of suspense and instills a desire for resolution. Tonal harmony exploits the dynamic qualities of chords within a given harmonic context to create tension, suspense, expectation, and surprise. It is worth noting that conceiving of music as a dramatic art would seem to shift the emphasis away from the value of a particular content in the music itself and toward the experience of dramatic qualities by the listener. Provided that we give ourselves over to it fully, a highly dramatic work may allow us to experience a form of catharsis and perhaps a state of exhausted repletion following the experiences of tension, suspense, and fulfillment.

Roger Scruton focuses on the listener’s sympathetic participation in the music in his account of musical expressiveness. He begins by suggesting that, because music cannot express exact states of mind, transitive notions of expressiveness give way to an intransitive conception of it. As a result, the import of expression in music lies in the listener’s response. Scruton claims that the listener’s response to expressive music is essentially a sympathetic one, a response to “human life, imagined in the sounds we hear.” For Scruton, the sympathetic response includes not only feelings, but actions and gestures as well. In order to hear music with understanding, we must move with it internally. Ultimately, for Scruton, our sympathetic response, our ‘moving with’ the music, is defined by the fact that music avoids explicit statement, while still inviting the listener to ‘enter into’ its expressive content. The experience of musical expressiveness consists in hearing it as “metaphorical movements in a metaphorical space.” The sounds are heard as figurative life, so that “you are the music while the music lasts.” In addition, though he does not believe it expresses any kind of cognitive content, Scruton suggests that the expressive qualities of a significant musical work can allow us to rehearse emotions that are otherwise very hard to feel.

Like Scruton, Christopher Peacocke gives a central place to metaphor in the experience of musical expressiveness. In a recent influential paper, Peacocke suggests that when music is heard as expressing a particular property, some feature of it is heard “metaphorically-as” that property. Offering a non-linguistic account of metaphor informed by current accounts in cognitive science, Peacocke argues that in listening to music metaphor “is exploited in the perception, rather than being represented.” Thus when a piece of music succeeds in expressing a particular property, some of its features are perceived metaphorically-as possessing some of the characteristics of this property. This may occur at a single moment, or through the development of the music over time.

In a reply to Peacocke, Malcolm Budd contrasts his own characteristically minimalist account of metaphorical content, on which the listener captures some character of the music as he perceives it, with Peacocke’s account, on which the perceived property is a constituent of the intentional content of the listener’s perception. Budd questions what information a metaphorical-as constituent of a perception carries. He suggests that if it carries no information, then the claim of metaphorical-as perception to cognitive status lapses. Kivy also questions Peacocke’s account, raising the normative issue of which metaphorical readings are permissible in Peacocke’s sense of metaphorically-as. He worries that it is unclear whether the account places limits on what can be heard metaphorically-as, leaving open the possibility that anything is permissible.

d. The Expression of Negative Emotions

The traditional question of the value of negative emotions in aesthetic experience applies to classical music as it does to the other arts. However, the question involves additional challenges in the case of pure music if one considers such music to be both abstract and highly expressive. In arguing for a specifically musical emotion that is both pleasurable to experience and universal to all aesthetically significant works of music, Gurney sidesteps the issue altogether. Nelson Goodman addresses the question by suggesting that in aesthetic experience “emotions function cognitively,” meaning that we use emotions to understand the aesthetic content of the work. In an influential essay entitled “Music and Negative Emotion,” Jerrold Levinson accepts Goodman’s suggestion and argues that Aristotle’s original account of catharsis also has substantial merit in the case of classical music. Beyond these, he identifies six additional “rewards” that may be associated with listening to music expressive of negative emotions, most having to do with benefits associated with experiencing and understanding emotions, either our own or another’s. Stephen Davies, by contrast, suggests in Musical Meaning and Expression that there is no real difference between our willingness to expose ourselves to negative emotions in music and our willingness to do so in other areas of life, so the question is more about our response to the human condition than it is about listening to music. A related possibility is that negative emotions in music offer a truthful reflection of our experience outside of music, and that we value such music in part because it affirms a reality we experience in our lives.

6. Human Experience and Values

Beyond the claim for emotionally expressive content in music, some writers have suggested that classical music possesses content that reflects aspects of human experience and values that surpass the expression of emotion, mood, and feeling, or the interplay of imagined personae. Wilhelm Dilthey and Jean-Paul Sartre both make such claims for music, and kindred claims can also be found in the writings of a number of contemporary aestheticians. However, while claims for a more significant human content in music resonate with many people, they have found only limited support among theorists because it has proven difficult to sustain an argument for the presence of this kind of content in music alone without tying the aesthetic claims to a larger philosophical framework that itself makes claims about human experience and values.

a. Dilthey and Music as the Expression of Lived Experience

Wilhelm Dilthey offers one of the most suggestive approaches to the expression of content of larger human significance in his late hermeneutical writings, especially in his discussion of musical understanding in “The Understanding of Other Persons and Their Manifestations of Life.” Dilthey’s argument for the expression of human experience in music depends upon a specific conception of what artistic expression consists in. Like Hegel, Dilthey holds that the psyche must obtain self-knowledge by objectifying itself. Unlike the literary, dramatic, and visual arts, however, music alone cannot make use of things or images from the shared external world, nor can it make use of the ability of words and images to refer to the inner world of emotions, perceptions, thoughts, and ideas. Instead, Dilthey argues, music transforms lived experience into a form of expression all on its own in a way that opens up areas of human experience not accessible to the other arts.

The composer does not translate feelings that arose outside of music into musical terms. Rather, the composer develops a capacity for specifically musical feelings through immersion in a musical tradition, in this case the tradition of Common Practice tonality together with all of the expressive techniques developed within this framework by individual composers. This capacity allows the composer to transform non-musical experiences into musical ones. Unlike most other expressive arts, music does not achieve its meanings through signification or representation. Instead, the capacity for musical feelings, as developed in relation to a musical tradition, takes the place of the capacity for signification found in language or that of representation found in the visual arts. Every art requires some vehicle or means through which to pursue the goal of appropriating the human world. In the case of music, Dilthey suggests, this vehicle is a capacity for musical feelings developed within a specific cultural tradition.

Expressions of lived experience in music, then, express not only the uniquely individual experience of the composer, but also that experience as it is shaped by a particular cultural and historical background. As Edward Lippman points out, a primary reason why Dilthey is able to develop his argument as he does is that he interprets the arts as a whole in relation to a conception of interconnected cultural systems that are themselves part of the “overall nexus of life.” It is only because classical music consists in a tradition that is interwoven into this nexus that it can transform lived experience into an object of artistic expression.

b. Sartre, Adorno, and Music as a Social Force

Offering a major revision of the theory of music that he presents in The Psychology of the Imagination, Jean-Paul Sartre argues in “The Artist and His Conscience” that rather than consisting in an object of ideal beauty, music instead expresses cultural-historical values. Sartre explores the musical work as a historical and cultural totality, which simultaneously reflects and transcends its time. He identifies music as a “non-signifying art,” one that does not refer beyond itself, but nevertheless possesses a meaning. This meaning cannot be adequately expressed by any system of signs, but instead “is always a matter of a totality, a totality of a person, a milieu, time, or human condition.” Sartre’s focus in this essay is upon the possibility of music as a committed art form, by which he understands an art form that furthers human freedom. As George Bauer points out, for Sartre the goal of the musician is to find a means of “revealing the liberty of the human condition within his compositions–even to the untutored.”

Sartre’s basic claim is that the aesthetic choices a composer makes reflect the values of the composer’s cultural-historical context. Although Sartre does not deny that music is capable of reflecting the individual values of the composer, he is primarily interested in the way that music reflects, and possibly allows for the transcendence of, the human situation in a particular time and place. Sartre’s claim stems from the intuition, present in Western philosophical thinking about music since the time of Plato, that music has social and political implications, that it can be a transformational force and a potential threat to the established order.

Like Sartre, Theodor Adorno interprets strictly musical qualities in classical music to have social and political implications. Although his influential sociological interpretation and critique of classical music lies outside the scope of the aesthetics of classical music, in his writings on specific composers Adorno identifies political and social implications, as well as other significant human content, in the composer’s treatment and alteration of musical conventions. In his writing on Mahler, Adorno argues that a social critique is evident in the relationship the composer establishes between the individual theme and the larger symphonic form. Although this relationship was traditionally regarded as a problem in Mahler, Adorno claims that Mahler’s liberation of individual themes from ties to the larger formal structure in fact establishes an “archaic banality,” akin to improvisation, which is “located prior to the constitution of the harmonically symmetrical relationships and corrodes them.” Seen in this light, the true significance of Mahler is that he is “using the archaically corroded material of romanticism … in protest against the bourgeois symmetry of form.” Against this symmetry he opposes “the free contours of the freshly trodden landscape of the imagination.” Thus, Adorno finds in Mahler’s alteration of conventional musical relationships a subversion of the bourgeois order, which is capable of elevating the social awareness of the listener.

Adorno finds another kind of human significance in the late style of Beethoven, arguing that it reveals the ultimate inability of art to address the human condition. The traditional view held that Beethoven’s late work reflects “an uninhibited subjectivity … which breaks through the envelope of form to better express itself.” Against this view Adorno argues that in Beethoven’s works generally, rather than breaking through form, the composer’s subjectivity creates it. The middle Beethoven transforms his musical materials according to his intention, freeing them from convention through the compositional uniqueness that he achieves. The late Beethoven, by contrast, makes use of “conventions that are no longer penetrated and mastered by subjectivity, but simply left to stand.” According to Adorno these conventional materials exist in a fractured landscape that reflects the composer’s encounter with mortality: “the finite powerlessness of the I confronted with Being.” Thus, Adorno concludes, “[i]n the history of art late works are catastrophes.”

c. Contemporary Theories

More recently, Patricia Herzog has argued that purely instrumental music can convey content of profound significance to human life and that the value of such music resides largely in the value of the content that is conveyed. Purely formal accounts of music overlook this content and consequently cannot offer insight into the most important aspects of musical value. In Herzog’s view, music criticism must seek to articulate aesthetic value by grasping human values in music. Drawing on the work of Edward Cone and Joseph Kerman, Herzog bases her argument on the intuition that music contains a significance to human life that cannot be grasped by limiting the study of music to intramusical relations and any expressive content these abstract forms may yield.

Herzog claims that grasping purely intramusical meanings will never answer the important questions about music, since such meanings fail to provide a sufficiently rich interpretive vocabulary and “do not generate categories that tell us why music matters.” These questions must be answered through an evaluative connection to the music, one that links the music to human interests. For Herzog, the best works of classical music possess a recognizable conceptual content of human significance. The profundity of this content plays a major role in determining the work’s aesthetic value. Aaron Ridley also claims that music can convey a profound content. Drawing on the music criticism of J.W.N. Sullivan and echoing Dilthey, Ridley argues that certain works of classical music convey the depth and quality of the artist’s experience of life and that through listening to them the music gives us the opportunity “to grasp, or at least to gain an inkling of, a state of soul or an outlook of extraordinary depth.” Arguing against positions such as those of Herzog and Ridley, in Music Alone Peter Kivy questions whether it is possible to articulate the profundity of music. Kivy suggests that the profundity of music can only be grasped directly in the listening experience. He agrees that music matters, but denies that its profundity consists in a content that can be articulated in terms of human experience and values. Kendall Walton takes a more moderate approach to extramusical content in purely instrumental music, proposing that while music does not, as some have suggested, call for imaginative interpretations of musical content in non-musical terms, it does call for “imaginative introspection.” This means that in the listening experience we imagine feeling particular emotions tied to the content of the music. Walton also suggests that music presents non-psychological properties such as struggle and achievement. According to Walton, music’s reference to extramusical realities, though imprecise, is important to explaining the power of music as an art form.

In The Aesthetics of Music, Roger Scruton holds that we hear music as purposeful “in the manner of human intention,” and thus events are perceived not just as movement but as action (though he rejects the need for reference to an imaginary subject). Scruton argues that because we experience music as “figurative life,” music embodies and transmits the values of the culture that produces it. When we enter into the music through sympathetic listening, we rehearse the patterns of emotions that correspond to those values. Like Plato, Scruton suggests that music exercises an influence on our character. Drawing an analogy to dance and its evolution from the Baroque period onward, Scruton claims that, through the feelings it causes us to experience in our sympathetic engagement with its gestures, classical music educates our emotions. Popular music, by contrast, increasingly represents for Scruton the decline of Western musical culture, a progressive movement toward disorder led by the sexual impulse. Appreciating classical music, Scruton argues, is a form of latent dancing, so that “the search for objective musical values is one part of our search for the right way to live.”

Theories that find music alone to be capable of expressing aspects of human experience and values must account for how an apparently abstract art can convey such content. Though attempts continue to be made to explain how music achieves this, most theorists find the explanations offered to date unsatisfactory. Dilthey’s hermeneutical account would appear to be among the best developed, but it relies upon additional assumptions about the nature of artistic expression and the compositional process that most theorists would not accept, or at the very least would find to be in need of significant additional exploration. Thus, while theories claiming the expression of human experience and values appeal to the common intuition that certain works of classical music possess a meaning that has larger implications for human life, definitive identification of such meanings has proven to be elusive.

7. References and Further Reading

  • Adorno, Theodor. “Late Style in Beethoven.” Trans. Susan Gillespie. Raritan 13:1 (1993): 102-06.
    • A reinterpretation of the meaning of stylistic qualities in Beethoven’s late works.
  • Adorno, Theodor. “Mahler Today.” Essays on Music: Theodor Adorno. Ed. Richard Leppert. Trans. Susan Gillespie. Berkeley: University of California Press, 2002.
    • Advances the claim that Mahler’s deviation from the thematic techniques of tonal harmony should be understood as an artistic subversion of the bourgeois order.
  • Bauer, George Howard. Sartre and the Artist. Chicago: University of Chicago Press, 1969.
    • An analysis of Sartre’s use of art and artists to convey his conception of the difference between being and existence as it relates to art.
  • Beardsley, Monroe. “Understanding Music.” On Criticizing Music: Five Philosophical Perspectives. Ed. K. Price. Baltimore: Johns Hopkins University Press, 1981.
    • Extends Goodman’s concept of exemplification to music.
  • Budd, Malcolm. Music and the Emotions. London: Routledge, 1985.
    • A penetrating critical examination of influential theories of emotion in music, including those of Hanslick, Gurney, Schopenhauer, Cooke, Langer, and Meyer.
  • Budd, Malcolm. “Musical Movement and Aesthetic Metaphors.” British Journal of Aesthetics 43:3 (2003): 209–23.
    • Argues against Scruton’s account of musical motion in terms of spatial metaphor, suggesting it is preferable to conceive of musical motion in terms of a purely temporal Gestalt.
  • Budd, Malcolm. “Response to Christopher Peacocke’s ‘The Perception of Music: Sources of Significance.’” British Journal of Aesthetics 49:3 (2009): 289-92.
    • An evaluation of Peacocke’s conception of the role of metaphor in music.
  • Budd, Malcolm. Values of Art. London: Penguin, 1995.
    • Complements his earlier work with the addition of a “basic and minimal” conception of emotion in music as well as an exploration of the value of music as an art form.
  • Carroll, Noël. “Art and Mood: Preliminary Notes and Conjectures.” The Monist 86:4 (2003): 521-555.
    • Explores the possibility that musical moods can offer a solution to the debate between formalist and arousalist positions.
  • Clifton, Thomas. Music as Heard: A Study in Applied Phenomenology. New Haven, Conn.: Yale University Press, 1983.
    • Considers the experience of music from a phenomenological perspective.
  • Cochrane, Tom. “Music, Emotions and the Influence of the Cognitive Sciences.” Philosophy Compass 5:11 (2010): 978–88.
    • Suggests that psychology and neuroscience can provide additional support for one theory of our experience of music over another, as well as in some cases allow us to reframe and synthesize traditionally distinct positions.
  • Cone, Edward T. The Composer’s Voice. Berkeley: University of California Press, 1974.
    • Argues for a theory of musical communication based on the composer’s musical personae.
  • Cook, Nicholas. Music, Imagination, and Culture. Oxford: Clarendon, 1990.
    • Examines music from the point of view of the composer and the listener, arguing that the role of the listener is of primary importance.
  • Cooke, Deryck. The Language of Music. Oxford: Oxford University Press, 1964.
    • Seeks to show that certain recurrent patterns present in the music have specific emotional meanings, making it possible to construct a basic emotional vocabulary of classical music.
  • Dahlhaus, Carl. The Idea of Absolute Music. Trans. Roger Lustig. Chicago: University of Chicago Press, 1989.
    • A hermeneutical inquiry into the history of our conception of absolute music.
  • Davies, Stephen. Musical Meaning and Expression. Ithaca: Cornell University Press, 1994.
    • A comprehensive discussion of major issues in musical aesthetics, including a presentation of his mirroring response theory of musical expression.
  • Davies, Stephen. Musical Works and Performances. Oxford: Clarendon, 2001.
    • An in-depth exploration of the nature of musical works and of authenticity in musical performances.
  • Davies, Stephen. Musical Understandings and Other Essays on the Philosophy of Music. Oxford: Oxford University Press, 2011.
    • A collection of essays addressing the listener’s response to the expression of emotion in music, the role of the listener in the perception and understanding of music, as well as other central issues in musical aesthetics.
  • DeBellis, Mark. “Music.” The Routledge Companion to Aesthetics. Ed. Berys Gaut and Dominic McIver Lopes. New York: Routledge, 2001.
    • An overview of major topics in musical aesthetics.
  • Dilthey, Wilhelm. Selected Works, Vol. 3: The Formation of the Historical World in the Human Sciences. Ed. Rudolf Makkreel and Frithjof Rodi. Princeton: Princeton University Press, 2002.
    • Contains Dilthey’s late hermeneutical approach to musical aesthetics in the essay “The Understanding of Other Persons and Their Manifestations of Life.”
  • Goehr, Lydia. The Imaginary Museum of Musical Works. Oxford: Oxford University Press, 1994.
    • Offers a genealogy of the concept of a musical work from antiquity onward, arguing that no analytic method can succeed in defining musical works and that before 1800 compositions and performances were not governed by the work concept.
  • Goldman, Alan. “The Value of Music.” Journal of Aesthetics and Art Criticism 50:1 (1992): 35–44.
    • Argues that music presents us with another world, separate from everyday life.
  • Goodman, Nelson. Languages of Art. Indianapolis: Bobbs-Merrill, 1968.
    • Highly influential work exploring the nature of musical expression and the relationship between works and performances.
  • Gracyk, Theodore and Andrew Kania, eds. The Routledge Companion to Philosophy and Music. New York: Routledge, 2011.
    • A comprehensive guide to major topics and thinkers in musical aesthetics.
  • Gurney, Edmund. The Power of Sound. New York: Basic Books, 1966.
    • A monumental study drawing on evolutionary theory to analyze the nature of musical expression.
  • Hanslick, Eduard. On the Musically Beautiful. Trans. Geoffrey Payzant. Indianapolis: Hackett, 1986.
    • Classic treatise in musical aesthetics, arguing that aesthetic value in music is purely formal in nature.
  • Herzog, Patricia. “Music Criticism and Musical Meaning.” Journal of Aesthetics and Art Criticism 53:3 (1995): 299-312.
    • Makes the case for content of a profound human significance in classical music.
  • Kant, Immanuel. Critique of Judgement. Trans. J.H. Bernard. New York: Hafner, 1951.
    • A foundational text in aesthetics; evaluates whether music is a proper object of aesthetic judgements.
  • Kivy, Peter. The Corded Shell. Princeton: Princeton University Press, 1980.
    • Presents the author’s contour theory of musical expressiveness, supplemented by a convention theory that accounts for our responses to those aesthetic qualities not addressed by the contour theory.
  • Kivy, Peter. “Mood and Music: Some Reflections for Noël Carroll.” The Journal of Aesthetics and Art Criticism, 64:2 (2006): 271-281.
    • Assesses Carroll’s account of the evocation of moods in classical instrumental music.
  • Kivy, Peter. Music Alone: Philosophical Reflections on the Purely Musical Experience. Ithaca: Cornell University Press, 1990.
    • Considers the experience of textless instrumental music, clarifying and defending the author’s cognitivist position.
  • Kivy, Peter. New Essays on Musical Understanding. Oxford: Clarendon, 2001.
    • A collection of essays addressing historical topics, emotional expression, and concatenationism vs. architectonicism.
  • Langer, Susanne K. Philosophy in a New Key. New York: Mentor, 1956.
    • Argues that works of music should be understood as unconsummated presentational symbols and as such symbolize the forms of human feeling.
  • Levinson, Jerrold. Music, Art, and Metaphysics. Ithaca: Cornell University Press, 1990.
    • An influential work containing six essays on musical aesthetics and covering topics such as the definition, ontology, meaning, performance, and appreciation of music.
  • Levinson, Jerrold. “Music as Narrative and Music as Drama.” Mind and Language 19:4 (2004): 428-441.
    • Argues that it is natural to hear music as drama and that doing so benefits from the introduction of an imagined persona, while attempting to hear it as narrative poses significant problems.
  • Levinson, Jerrold. Music in the Moment. Ithaca: Cornell University Press, 1997.
    • Presents a sustained argument for concatenationism.
  • Lippman, Edward. A History of Western Musical Aesthetics. Lincoln: University of Nebraska Press, 1992.
    • A thorough survey of influential figures, with an emphasis in its 20th century coverage on continental aesthetics.
  • Lippman, Edward. Musical Aesthetics: A Historical Reader. 3 vols. New York: Pendragon Press, 1986.
    • An excellent source book in musical aesthetics.
  • Meyer, Leonard B. Emotion and Meaning in Music. Chicago: University of Chicago Press, 1961.
    • A foundational inquiry into musical meaning, focusing on expectation generated by antecedent-consequent relationships.
  • Meyer, Leonard B. Music, the Arts, and Ideas. Chicago: University of Chicago Press, 1967.
    • Reworks central aspects of the theory presented in Emotion and Meaning in Music.
  • Narmour, Eugene. The Analysis and Cognition of Basic Melodic Structures. Chicago: University of Chicago Press, 1990.
    • A further development of the basic approach established by Meyer.
  • Nattiez, Jean-Jacques. Music and Discourse: Toward a Semiology of Music. Princeton, N.J.: Princeton University Press, 1990.
    • Argues that music possesses a syntax and thus can be interpreted similarly to any other system of signs.
  • Peacocke, Christopher. “The Perception of Music: Sources of Significance.” British Journal of Aesthetics 49:3 (2009): 257-275.
    • An influential paper arguing that in listening to music metaphor is “exploited in the perception, rather than being represented.”
  • Ridley, Aaron. Music, Value, and the Passions. Ithaca: Cornell University Press, 1995.
    • Focuses on the melismatic gesture as a central component of musical expressiveness.
  • Robinson, Jenefer. Deeper than Reason: Emotion and its Role in Literature, Music, and Art. Oxford: Clarendon, 2005.
    • Drawing on the author’s own theory of emotion, offers an account of musical expression and of the capacity for music to arouse emotions in the listener.
  • Sartre, Jean-Paul. The Psychology of Imagination. New York: Citadel, 1991.
    • Sartre’s early account of music as presenting ideal beauty.
  • Sartre, Jean-Paul. Situations. Trans. Hazel E. Barnes. New York: George Braziller, 1965.
    • Contains the essay, “The Artist and His Conscience,” which argues that music captures a historical milieu and additionally that music can be a transformational force used to further human freedom.
  • Schenker, Heinrich. Free Composition. Trans. and ed. Ernst Oster. New York: Longman, 1979.
    • Classic treatise in musical analysis emphasizing the architectonic aspects of musical compositions.
  • Schopenhauer, Arthur. The World as Will and Representation. Trans. E.F.J. Payne. Indian Hills, Col.: Falcon’s Wing Press, 1958.
    • Presents Schopenhauer’s philosophy of music as having the privileged status of being a direct presentation of the will, which is the thing-in-itself or underlying metaphysical reality.
  • Scruton, Roger. The Aesthetics of Music. New York: Oxford University Press, 1997.
    • A thorough and insightful discussion of many of the major issues in musical aesthetics, including spatiality, ontology, expression, understanding, content, and both experiential and cultural value.
  • Scruton, Roger. “Musical Movement: A Reply to Budd.” British Journal of Aesthetics 44:2 (2004): 184–7.
    • Argues for the indispensability of metaphor in the listening experience.
  • Serafine, Mary Louise. Music as Cognition: The Development of Thought in Sound. New York: Columbia University Press, 1988.
    • Identifies twelve cognitive processes that are components of musical cognition and assesses experiments on people of different ages intended to shed light on how these processes develop.
  • Walton, Kendall. “What is Abstract about the Art of Music?” Journal of Aesthetics and Art Criticism 46:3 (1988): 351-364.
    • Argues that music’s reference to extra-musical realities such as unnameable feelings and the dynamics of emotions, though imprecise, is important to explaining the power of music as an art form.
  • Zangwill, Nick. “Music, Metaphor, and Emotion.” Journal of Aesthetics and Art Criticism 65:4 (2007): 391–400.
    • Argues against emotion theorists, claiming that what we experience in response to music is in some ways similar to, but not equivalent to, actual emotion, and that instead of taking emotional descriptions of music literally, we should understand them as aesthetic metaphors.
  • Zuckerkandl, Victor. Sound and Symbol. Trans. Willard Trask. New York: Princeton University Press, 1956.
    • An influential early study investigating our experience of tone, motion, time, and musical space.

 

Author Information

Michael Bazemore
Email: mbazemore01@gmail.com

U. S. A.

Ancient Ethics

Ethical reflection in ancient Greece and Rome starts from all of an agent’s ends or goals and tries to systematize them. Our ends are diverse. We typically want, among other things, material comfort, health, respect from peers and love from friends and family, successful children, healthy emotional lives, and intellectual achievement. We see all these things as good for us. So, systematizing our ends involves considering how various goods that we have or seek fit together. In particular, it involves thinking about what makes life good overall—what a happy human life consists in. In ancient ethical theory, then, the core question is: how can I live well? That is, how can I flourish and live a happy life? To a first approximation, happiness consists in having good things, but this formula must be read liberally. The most important goods in life may be activities or experiences, not things that one has in a quite narrow sense. If so, then happiness—having good things—centrally involves the relevant activities or experiences.

Rational reflection on these questions is not just an odd intellectual pursuit unconnected from living life well. Rather, the ancients agree that practical intelligence or wisdom—some sort of understanding of how our ends and goals fit together—is central to living well. We must grasp which ends subserve others (instrumentally or constitutively), which ends are important to our lives as a whole and which are not, and which ends we should reconceive, restrain, abandon altogether, or newly introduce because of how they fit (or fail to fit) with others. We can then guide our lives intelligently, better achieve our ends, and so live well and be happy. This ability to guide our lives intelligently is itself good for us. In fact, it can seem good in a different way from the other ends it governs. Other goods are bad in special circumstances and can be misused. For example, strength is bad when a tyrant conscripts the able-bodied to fight in an unjust war, and it can also be used to bully others. Practical intelligence is always good and cannot be misused; it is unconditionally good for the agent. Since happiness consists in having good things, in a suitably broad sense, and since practical intelligence is a preeminent good, living well centrally involves having and exercising practical intelligence.

This introduces another main feature of ancient ethics: it gives a central role to human excellence or virtue. Practical intelligence—a systematic, coherent grasp of all the goods in a life—is a virtue. Clearly, such a virtue, which amounts to expertise at living, plays a crucial role in living well (as expertise in any domain plays a crucial role in good performance in that domain). So this virtue, at least, is necessary for happiness. By reflecting on how practical intelligence connects to other virtues, we can see why ancient ethical theories say that virtue more generally is necessary, or even necessary and sufficient, for happiness.

Table of Contents

  1. Plato
  2. Aristotle
  3. Stoicism
  4. Academic Skepticism
  5. Epicureanism
  6. Pyrrhonism
  7. References and Further Reading
    1. Primary Works
    2. Secondary Works

1. Plato

Plato says that happiness is the possession, or the possession and correct use, of goods. Correlatively, misery is the possession of bads, or the possession and incorrect use of goods. If we ask why anyone does what she does, and reach the point of showing how her action fits into a happy life, we have fully explained and justified her action; no further question about why she wants to be happy and live well is apt. Put another way, we do everything for the sake of happiness, and we need nothing beyond happiness. Wisdom is both our highest good and the ability to use other goods well and beneficially. So, wisdom should be the first concern of anyone who wants to live well and be happy—that is, everyone. In particular, wisdom is more important than bodily and reputational goods such as health and honors. But as the condition that enables skillful activity in any domain is expertise in that domain, so too the state that enables skillful activity with goods is expertise concerning goods. So, wisdom—the highest human good—is knowledge of the good.

However, a problem lurks. If wisdom is the good for a human being, and the highest good for a human being is knowledge of the good, then wisdom seems to be knowledge of itself. This is unintelligible, and even if it were intelligible, it sounds useless. So, Plato introduces the form of the Good, distinct from other goods (including the highest human good) as the proper object of wisdom. The form of the Good is good without qualification—it is not merely the good of this or that sort of thing; it is what goodness is, in relation to which other goods are (qualifiedly) good. This gives a formal characterization of the Good; more substantively, the Good is unity. So, each thing is good when it is unified; civic unity is the highest good of a city, and psychological unity the highest good of a soul. That is, the soul achieves its highest good by putting its ends and attitudes into a coherent structure. This happens by coming to know the Good; when someone grasps that, she becomes like the object of her knowledge—the Good is unity, and knowing the Good unifies the soul. This identification of the Good with unity is one reason why Plato thinks that mathematics prepares the way for ethical knowledge.

That covers wisdom and its primary object, but what about other virtues? Plato sometimes says that all the virtues simply are wisdom—for example, that wisdom enables one to rule one’s pleasures and appetites (so that temperance is wisdom) and fears (so that courage is wisdom). On this view, there is only one virtue with several names. Elsewhere, he offers a somewhat weaker view: there are several virtues, but having one requires having them all. Even the weaker view provokes surprise; common sense says one can be, for example, just but not temperate, or wise but not courageous. Both versions of the claim that virtue is unified are grounded partly in the claim that affective states represent their objects as good or bad. For example, when someone fears heights on some occasion, she fears the harms of falling—fear represents something as bad for the subject. But wisdom systematically grasps what is really good and bad for us. So, the wise person never harbors any false belief about what is really good or bad for her. Hence, she fears things only to the extent that they really are bad for her—neither more nor less. That is, she is courageous, rather than cowardly or rash. Some things that the wise person knows are not bad may still appear bad to her, though, just as perceptual illusions persist even for those who do not trust in them.

Justice is a particularly important case; two of Plato’s longest works defend the claim that justice is unconditionally good for its possessor. The Gorgias says justice is organization, so a just soul is an organized soul; the Republic says justice is the condition in which each does its own work, so a just soul is one in which each part of the soul does its own work. As with the other virtues, justice is closely connected to wisdom. Again, wisdom is knowledge of the good; it is a systematic and coherent grasp of the relationships among all the goods one seeks. So, wisdom organizes the soul, and the wise person will be just. Because justice is so closely tied to wisdom, it is unsurprising that, like wisdom, it is unconditionally good for the agent. Thus, acting unjustly for the sake of mere conditional goods (for example, wealth or political power) is never prudent. For example, one cannot betray a friend and still have an organized soul; such actions reveal deep ignorance of what goods are most important and make life go well. Some scholars have worried that one could perhaps betray friends and still have an organized soul. Addressing this concern requires reflecting on whether loyalty to friends is actually more important than wealth. If it is, then someone with an organized soul will track this fact, and will never betray friends for the sake of wealth. And if having an organized soul is unconditionally good, then betraying friends for the sake of wealth is never prudent.

As we have seen, Plato thinks virtue is closely related to happiness. In particular, virtue is necessary for happiness—the vicious are not happy, but miserable. But we have not yet seen whether he thinks virtue suffices for happiness, or what else might conduce to happiness. Two important commitments in this regard—which Plato never explicitly thematizes, but regularly assumes—are that virtue and happiness (and vice and misery) come in degrees. Because virtue is the central determinant of happiness, it seems clear that as one becomes more virtuous, one becomes happier. One might take virtue to be the sole determinant of one’s degree of happiness. But in fact, Plato thinks that goods and bads other than virtue and vice—conditional goods such as wealth and honors—are relevant to how happy one is. These have opposite effects on the virtuous and vicious. Somebody with a certain degree of virtue, but with more conditional goods, is happier than somebody with the same degree of virtue but without those goods, or with correlative conditional bads. Somebody with a certain degree of vice, but with more conditional goods, is more miserable than somebody with the same degree of vice but without those goods, or with correlative conditional bads.

The reason for this is that conditional goods enable one to exercise one’s character more widely, while conditional bads prevent one from exercising one’s character as widely. Conditional goods thus allow a virtuous person to exercise her virtue more widely and a vicious person to exercise her vice more widely—they allow virtuous and vicious people to perform more virtuous and more vicious actions, making them happier and more miserable, respectively. Conditional bads keep virtuous and vicious people from performing actions that express their virtue or vice as fully, which makes them less happy and less miserable, respectively. Plato may think that these activities affect our happiness or misery directly, or he may think that their influence on our happiness is fully mediated by how they further shape our characters; he never commits himself one way or the other.

Plato thinks the highest human good is systematic knowledge of the Good (unity) together with the virtues identical to or entailed by that knowledge. Naturally, he rejects competing candidates for the highest human good, such as pleasure, love and friendship, and artistic achievement. In each case, he says how these other plausible candidates relate to his view.

The main alternative way of trying to unify our ends is hedonism, the view that the good—which we do everything for the sake of and which is all we need—is pleasure. Plato argues against hedonism in two main ways: (i) pleasure and pain occur together and cease together in the same place at the same time, as opposites like good and bad do not; (ii) pleasure is a process of restoration culminating in a good, harmonious condition, so pleasure cannot be the same as the good, harmonious condition it culminates in. These points are related: since pain is the felt disturbance of a good, harmonious condition, and pleasure the felt restoration to a good, harmonious condition, pleasure and pain (for example, pains of hunger and pleasures of eating) often occur together and cease together. This observation also allows Plato to argue that the virtuous live most pleasantly (although their pleasures do not make them happy). Because most bodily and reputational pleasures coincide with contrasting pains, they seem more intense than they are. (Compare how colors seem more intense against a contrasting background.) In fact, though, the pleasures associated with virtue and knowledge are larger than bodily and reputational pleasures—or so Plato argues.

Plato takes a similar line on love, friendship, and art: he denies that any of these provide the principles around which one can successfully organize one’s ends and live well, but he recognizes that they play important roles in such a life. When two people love each other and are friends, we can ask about the basis of their friendship. Not just any relationship makes life go well, and relationships directed at some objects can actually keep us from living well. So, we must say what love and friendship are for; Plato suggests that proper love and friendship are directed at the human good—at wisdom and virtue. But love and friendship are not just one way to seek wisdom and virtue. Plato always emphasizes the social character of philosophy (that is, love of wisdom). His approach to art is similar: the wrong kind corrupts us when young and tempts even good adults to hold vicious attitudes. However, the right kind of art is important to developing good character in childhood and to sustaining good character through an entire life.

One last topic deserves mention: Plato thinks that the soul is immortal and transmigrates. This is relevant to his ethics not because he thinks one should act differently in this life because the soul is immortal, but because it raises the stakes for decisions made in this life. Our choices have ramifications for our character in the afterlife and in our next life. So, Plato thinks about character development in the very long term—over many cycles of birth and death, covering many thousands of years.

2. Aristotle

Aristotle was Plato’s student, so we should not be surprised to see him developing similar ethical views. Still, there are differences of emphasis, points on which Aristotle is more explicit, and some points of clear disagreement between them.

Aristotle provides formal criteria for our final end—happiness—that closely resemble Plato’s. We do everything for the sake of happiness and do not seek it for the sake of anything further, and we need nothing beyond happiness. The best candidate for something of this sort, he argues, is a full life of excellent rational activity. Some readers think Aristotle has a compound theory of happiness: it is a full life of excellent rational activity, plus external goods such as health, wealth, good looks, and good children. However, Aristotle clearly distinguishes what happiness consists in from what it needs as background conditions, and he thinks happiness needs external goods as background conditions, not as constituents. There are two reasons for this. First, excellent rational activity requires some external goods as tools; second, lack of some external goods “spoils our blessedness.” One way to understand the latter claim is to notice that excellent rational activity must be unimpeded and pleasant; since everyone wants external goods, we need some not to be pained at their lack. Aristotle and Plato agree in thinking that the virtuous person lives a better life with more external goods, but Aristotle thinks that enough external bads hinder excellent rational activity. They make the virtuous person unable to exercise her virtues fully, either for lack of tools or because her activities are impeded by pain. Plato thinks rather that lack of external goods or presence of external bads cannot prevent the virtuous person from living well, but only that these can prevent her from living the happiest possible life.

Aristotle distinguishes two kinds of virtues that rational creatures can have and exercise: intellectual and character virtues. The highest intellectual virtue is wisdom (sophia), which combines a grasp of the world’s highest principles (nous) and ability to reason deductively from them (epistêmê). The first principle of the world is the source of change that does not itself change—often called the “unmoved mover,” or God. Aristotle calls God the highest good, with which he proposes to replace Plato’s form of the Good. Plato distinguishes the form of the Good from any thinkers or thoughts about goodness, and identifies God with intelligence. But Aristotle says God is both thinker and object of thought. Plato’s God is personal; Aristotle’s is impersonal and does not think about the things it changes. God changes other things not by deliberating and acting, but by being what changing things strive to be like, to the extent possible. For example, the stars change in the smallest way possible for things that change: by circular motion. A life spent in exercising the highest intellectual virtues, to the extent possible, is the best life, and a life most like God’s. A life spent in exercising character virtues is also happy, but we exercise character virtues in part to make contemplation possible, while we contemplate just for its own sake. Thus, exercise of the character virtues fits the constraints on our final end less well than exercise of the highest intellectual virtues.

One intellectual virtue, practical wisdom (phronêsis), has a special relationship to character virtue; nobody can have any character virtue without practical wisdom, and nobody can have practical wisdom without all the central character virtues. Thus, Aristotle subscribes to a version of the unity of virtue. Practical wisdom and the character virtues shape and govern the parts of the human being that are non-rational but susceptible to reason (which are concerned with the material and social conditions of human life). One can exercise the character virtues in private life only or also in public life; the latter involves exercising them more widely, so it is preferable and more godlike. Thus, Aristotle addresses his discussion of character virtue to those who intend to enter politics. This is related to an odd feature of Aristotle’s account of justice: because each character virtue can be exercised in relation to others, he identifies “general justice” with the entirety of virtue. Again, practical wisdom and the character virtues can be exercised privately or politically, but achieve their fullest expression politically. The virtue concerned with other people is justice, so there is a sense in which justice encompasses all of character virtue and a correlative sense in which political expertise is simply practical wisdom writ large.

Aristotle describes each character virtue as being (and hitting) a “mean” in both action and feeling. Hitting the mean in action and feeling involves doing the right thing and feeling the right thing, at the right time, in the right ways, in relation to the right people. (Hitting the mean need not involve doing or feeling a moderate amount; it can be right to perform a grand action or to refrain from acting entirely, and it can be right to feel intensely or not to feel at all.) Each virtue is a mean by falling between two vices—wit, for example, is a mean that falls between buffoonery and boorishness.

The ability to figure out what to do can come apart from feeling the right way about one’s situation. (For example, one might see that one should confront a sexist comment, but be more afraid of doing so than one should be.) In such cases, one can either do the right thing despite one’s feelings, or act on one’s feelings contrary to one’s considered judgment. In the former case, the action is continent (but not virtuous); in the latter case, the action is incontinent (but not vicious). So, continence and incontinence are states of character between virtue and vice. Aristotle also sketches a character worse than vice, “brutishness,” and a character superior to virtue, which is godlike.

The ideal of being like God returns us to an important tension in Aristotle’s treatment of external goods. Usually, Aristotle thinks that lack of external goods can ruin the happiness of a virtuous person by impeding her exercise of virtue, and possession of external goods enables the wider exercise of virtue. He even introduces special virtues concerned with great wealth (magnificence) and great honor (magnanimity). Elsewhere, though, he argues that the contemplative life is superior to the political life in part because it needs fewer external goods, and he posits a godlike state that transcends virtue in its detachment from ordinary human concerns like health and wealth. This problem also arises in the case of friends. Friendship in the core sense involves seeking to become virtuous and acting well together. But the virtuous are self-sufficient, and the self-sufficient need friends least; so, the virtuous need friends least. In particular, the more godlike someone becomes, the less she needs friends at all. Aristotle has ways of trying to address this problem: he says that the virtuous need friends so that they have someone to benefit, and in order to best enjoy activities that are their own (since a friend is a “second self”). However, these two strands of Aristotle—one stressing the need for external goods and friends, the other stressing the need for independence from external goods and friends—remain in tension.

3. Stoicism

Stoicism comprises a centuries-long tradition, involving considerable disagreement among its adherents. This article focuses mainly on early Stoicism as articulated by its first three scholarchs: Zeno, Cleanthes, and especially Chrysippus. Some of the claims called “Stoic” here are rejected by other, later Stoics such as Panaetius and Posidonius. Some are rejected by an important early Stoic, Aristo, who lost a struggle to define the movement and so was retroactively deemed heterodox. There are disagreements among the earliest scholarchs as well, only a few of which are tracked here.

“Nature” plays a significant role in Plato and Aristotle’s ethics, especially in the contrast between nature and convention. But nature as a central organizing principle in ethical theory takes off in the Hellenistic period. For the Stoics, this emerges in their formula for the final end, “living in accordance with nature.” Cleanthes understands “nature” here as cosmic nature, while Chrysippus understands both cosmic and human nature.

One key appeal to human nature comes in the form of a “cradle argument,” which uses the behavior of unsocialized babies to establish what is natural and not merely conventional. The Stoics say that a newborn first finds herself and her constitution congenial (oikeion). So, she has an impulse to preserve herself and her constitution. Thus, the newborn finds whatever preserves herself and her constitution congenial, and has an impulse toward them; she finds whatever destroys herself and her constitution uncongenial, and has an impulse away from them. Our constitution includes bodily, psychological, and social abilities. At first, these are unsophisticated; the baby can flail her limbs, perceive her surroundings, and demand food from her caretakers. All these capacities are natural to her, congenial to her, and she has an impulse to exercise and preserve them. In short, the uncorrupted baby, her capacities, the exercise of those capacities, and whatever conduces to the preservation and exercise of herself and her capacities, have value for her. The opposites all have disvalue.

Next, the Stoics sketch the development of more bodily, psychological, and social abilities. We can stand, walk, and run; we can distance ourselves from appearances and assess whether things are as they seem; and we can engage in reciprocal relationships with others. These developments are natural to us. We continue to find ourselves and our developing constitutions congenial and have an impulse to exercise and preserve ourselves and our constitutions. Again, all these things have value for us and the opposites have disvalue.

Some key psychological aspects of our constitution are the capacity to receive impressions (for things to seem a certain way); the capacity to assent to impressions and so form beliefs, or else to withhold assent; the capacity to receive “graspable impressions” (true impressions that could not possibly be false); and the ability to distinguish graspable from non-graspable impressions, and assent to the former but not the latter. Assent to a graspable impression produces a grasp (katalêpsis), which constitutes an infallible awareness of a small part of reality. Grasps are the Stoic “criterion of truth”—the proper touchstone for any inquiry or argument—but they do not amount to knowledge. Knowledge requires stability, even in the face of dialectical examination (as it did for Plato). That requires assenting only to graspable impressions and organizing one’s grasps into a stable explanatory structure. This sets a high bar for knowledge (and for virtue, which, as we shall see, the Stoics identify with knowledge). Few humans, if any, ever attain knowledge. Still, grasps are a stepping stone; both the wise and the foolish have them, and they offer a path from foolishness to wisdom. Even though few of us make it, wisdom is the natural end point of human development.

This brings us back to value, which is distinct from goodness. Only what always benefits is good, just as only what always makes things hot is heat. That is, goodness is unconditional value. Most valuable things lack unconditional value (are not good) for familiar reasons: in special circumstances, things that are ordinarily valuable are disvaluable, and most valuable things can be misused. So, the Stoics call conditionally valuable things preferred indifferents, which should be selected; conditionally disvaluable things are dispreferred indifferents, which should be rejected. Things of no value or disvalue, or very little, are strictly indifferent and should be neither selected nor rejected. Only good and bad things should be chosen and avoided; these unconditional impulses are only fittingly directed at good and bad objects.

This introduces a crucial concept: appropriate actions (kathêkonta), or actions that admit of a reasonable defense. Importantly, the agent need not be able to provide such a defense to perform an appropriate action. (Even non-rational animals have and can perform their own appropriate actions.) As the wise and foolish both have grasps, so both the virtuous and vicious can perform appropriate actions. However, only the wise person can defend her grasps and her actions in the face of all questioning. Since the wise person (also called the sage) does appropriate actions for the right reasons, the Stoics call her actions right actions (katorthômata). The sage’s rational defense of her actions appeals to the value and disvalue of the preferred and dispreferred indifferents at stake, and explains how her selections and rejections respond appropriately to that value and disvalue. There are no action-types (aside from virtuous actions) that the sage always performs; occasionally, even cannibalism and incest are appropriate actions.

If the sage appeals to the value and disvalue of indifferents to explain her actions, where do virtue and the good enter the picture? Start from the developing agent who not only reacts immediately to particular valuable and disvaluable things, but who can compare value and disvalue and sometimes, at least, find the appropriate action. The next step in proper development is to perform appropriate actions regularly and reliably. Eventually, the agent appreciates how appropriate actions fit together into an orderly, harmonious life. At this point, the developing agent comes to see that the order and harmony of her life—made possible by reasoning about value and disvalue—has a value different in kind from the value of the things she reasons about. That order and harmony is, in a word, good.

The primary good thing in Stoicism is virtue, or practical intelligence about comparative selection-value. (Other goods include virtuous activity, the virtuous agent, and a friend—only the good are friends, because only they harmonize with themselves and each other.) The virtuous person appreciates the relevant values at stake in her circumstances and has a stable, coherent view about how to compare the values at stake. (She also knows that she acts with imperfect information, so she acts “with reservation”—in the knowledge that new information may require a change of plans or attitudes.) Unlike preferred and dispreferred indifferents, one would always rather have virtue so understood, and it cannot be misused. That is, virtue has unconditional value—it is good. The sage selects and rejects indifferents constantly and firmly and so has the “smooth flow of life” that the Stoics call happiness.

Since happiness is the possession (or possession and correct use) of goods, and since the Stoics think virtue is the only good and cannot be misused, the virtuous person is happy. The sage’s happiness does not depend upon whether she actually acquires preferred indifferents and not dispreferred indifferents; that is why they are indifferent (with respect to happiness). Virtue is perfect psychological coherence, which does not come in degrees, so neither does happiness. Thus, the sage is fully happy even on the rack (because she has and exercises virtue) and she always acts virtuously. Cicero illustrates this point with the example of Regulus, a Roman general who was captured by the Carthaginians. Regulus promised that he would carry terms of surrender back to Rome and then return. When he arrived in Rome, he argued against accepting the terms, returned to Carthage as promised, and was tortured and killed there. (Notice that this counts as an appropriate action only if keeping a promise to the enemy and its effects had greater selection-value than Regulus’ physical comfort and continued life and their effects. One cannot assume that Regulus’ behavior is required by justice, because the Stoics deny such general claims as “one should always keep promises,” “one should never have sex with close relatives,” and “one should never consume human flesh.”) On the flip side, everyone who is not a sage is foolish (because we all lack perfect psychological coherence) and miserable (because we all have the only bad thing, vice). All non-sages are equally vicious and miserable, even those who are making progress (prokopê), much as those who are underwater but rising toward the surface are drowning no less than those who are not rising toward the surface.

We are now in a position to understand the view most often associated with Stoic ethics: advocacy of freedom from passions (apatheia). This does not mean that we should have no affective life at all. The Stoics have a technical definition of passions (pathê) as fresh, weak judgments that something is good or bad. (A judgment is fresh when it is newly assented to; a judgment is weak when it is unstable and so not known, even if it is true.) The four highest species of passion are pleasure, pain, desire, and fear. Pleasure and desire represent their objects as good in the present and future, respectively, while pain and fear represent their objects as bad in the present and future. The sage has good versions of three of these four: joy (reasonable elation), wish (reasonable choice), and caution (reasonable avoidance). The Stoics recognize no good version of pain, which suggests that the “good feelings” (eupatheiai) are strong, known judgments about what is genuinely good and bad, never directed at preferred or dispreferred indifferents. The sage, being wise, will never judge that anything that is neither good nor bad—for example, any preferred or dispreferred indifferent—is either good or bad. Further, while she remains a sage she is not bad, though she may become bad again. So, she is fittingly cautious about future bads, but she will never experience a negative affect directed at a present badness of her own, since she has none. For as long as she is wise, she is virtuous, good, and happy, not vicious, bad, and miserable.

So far we have focused on human nature, but we saw above that Cleanthes and Chrysippus both think our end involves living in accordance with cosmic nature. Accordingly, physics (knowledge of nature in general) is a virtue. But how more specifically does knowledge of the cosmos connect to ethics? In at least two ways. First, the Stoics are pantheists—the study of nature reveals that it is providentially ordered, and indeed that the cosmos simply is God. God’s beneficial arrangement of the cosmos (that is, of God’s body) requires that God be good and virtuous. Given the paucity of human sages, physics is the study of the only virtuous, good thing we know. Second, the Stoics use the providential governance of the cosmos and our role as parts of it to argue for ethical conclusions—especially that we should value the common interest more than our own. Chrysippus uses a striking image: suppose our feet were rational. The rational foot would understand itself as part of a larger rational organism, and conduct itself accordingly. For example, given its understanding of what is valuable for the whole of which it is a part, the foot would sometimes want to be muddied. The foot might even desire to be amputated if amputation were the only way for the whole rational animal to carry on in the best way. But each human being is in fact a rational part of a rational whole, the cosmos. So, given our understanding of what is valuable for the cosmos as a whole, we should sometimes want to have dispreferred indifferents, and even sometimes to die, so that the whole cosmos can carry on in the best way.

4. Academic Skepticism

The Academics take their name from Plato’s Academy. Arcesilaus was a head of the Academy who took the school back to (what he thought were) its skeptical roots. Here he could appeal to Plato’s Socrates, who denied knowing anything important and tried to show others that they were in the same position. He could also appeal to Plato himself, who can be seen as distancing himself from any dogmatic views by writing dialogues, many of which end in puzzlement. The Academics would argue on both sides of any question; in one famous case, Carneades—the greatest of the Academics—went to Rome and argued for justice on one day and against justice on the next. A favorite Academic target was the Stoic claim that cognitive impressions exist and can be distinguished from non-cognitive ones; debates between Academics and Stoics persisted for generations.

Like other global skeptics, Academics must explain how they can maintain their skepticism without walking off cliffs. They say that they do and maybe even believe what is reasonable or plausible. Plausibility comes in degrees, and Carneades suggests three important grades: initially plausible impressions, uncontroverted impressions (which are not only plausible but also agree with related plausible impressions), and thoroughly tested impressions (which require examining each of the related plausible impressions that agrees with an uncontroverted impression). One can rely on different grades of plausibility depending on the matter at hand. To jump away from something on the ground that may be a poisonous snake, the Academic only needs a plausible impression; to decide how to live, she will want thoroughly tested impressions.

In the Academic–Stoic debate, both sides made accommodations under dialectical pressure. Eventually, one Academic, Antiochus of Ascalon, rejected skepticism and accepted views close to Stoicism in both epistemology and ethics. (Cicero, another late Academic who held more firmly to skepticism, did something similar; his De Officiis rehearses and then supplements the Stoic Panaetius’ work on appropriate actions.) Antiochus claims to be recovering an ancient consensus among Plato, Aristotle, and the Stoics. In ethics, this putative consensus says that virtue suffices for happiness, but possession of external and bodily goods makes the happy person happier, while their lack makes her less happy. The Stoics (says Antiochus) just use new and misleading language to state this consensus view. Antiochus’ “consensus view” lies quite close to Plato’s (as described above), but he papers over differences among his view, Aristotle’s view, and the Stoics’ view on the role of bodily and external goods in happiness. Antiochus’ view of Aristotle is understandable, though, especially since the Aristotelians of his day did hold the view that he attributes to Aristotle.

5. Epicureanism

The views canvassed above all accept that living well consists in virtue or virtuous activity. (Though the Academics are skeptics, they reliably seem to find this sort of view more plausible than the alternatives.) Another kind of ancient ethical theory says that living well consists in pleasure; the most important such view is Epicureanism.

Although they are outliers in other ways, the Epicureans operate from standard constraints on our final end: we do everything else for its sake, and we do not seek it for the sake of anything else. They use several approaches to defend their claim that the final end of all our actions is pleasure. First, they say that pleasure’s goodness is evident in perception and need only be pointed out, not argued for—much as we need not argue that fire is hot, since its heat is evident in perception. Second, like the Stoics, the Epicureans offer a version of the cradle argument. Where the Stoics say that the newborn’s first, uncorrupted impulse is for the exercise and preservation of herself and her constitution, the Epicureans say that she goes for pleasure. Finally, some Epicureans responded to arguments against hedonism. Sadly, no direct replies to the best anti-hedonist arguments of antiquity survive, but we do have some attempts to explain why many people deny the obvious truth of hedonism.

In one way, bodily pleasures and pains have a special role in the Epicurean view: all other pleasures and pains must be “referred to” them, directly or indirectly. For example, worry about losing one’s job might be referred directly to pains of hunger and physical exposure (because the job pays for food and shelter). Worry about what the boss thinks might be referred to worry about losing one’s job, and indirectly to the same bodily pains. This can be repeated indefinitely; perhaps one’s worry about proper clothing is referred to what the boss thinks, and so on. The key claim is that all psychological pleasures and pains must ultimately be referred back to the body. Plato and others, in contrast, say that we have basic non-bodily pleasures and pains that are not referred to the body at all—for example, shame at one’s bad reputation, or pleasure at learning something new.

In another way, though, psychological pleasures and pains have a special role: they have greater magnitude than bodily pleasures and pains. On this point, the Epicureans actually agree with Plato and others above. However, they explain the comparative magnitudes in a different way: the body only registers what is happening right now, while the soul ranges over past, present, and future. The soul thus represents to itself a much larger array of pleasures and pains, and can feel more pleasure and pain than the body can at a moment. (Here the Epicureans disagree with their hedonist predecessors, the Cyrenaics, who say that bodily pain is used as punishment because its magnitude is greater than pain of the soul.)

The other most important Epicurean thesis about pleasure and pain is their denial that there is any neutral hedonic state in which one experiences neither pleasure nor pain. (On this point, they disagree with both Plato and the Cyrenaics.) If there is no neutral hedonic state, then complete removal of pain obviously cannot culminate in the neutral state; the condition in which one is completely free of pain must be pleasure. In fact, once pain is removed, they say, pleasure cannot be intensified, in either the body or the soul. Because psychological pleasures are greater than bodily pleasures, freedom from disturbance of the soul (ataraxia) is the key determinant of happiness, more important than freedom from bodily pain (aponia). Thus, any bodily pain can be outweighed by the pleasure of freedom from disturbance, and the Epicurean sage can live well in any external circumstances, even on the rack. Ataraxia (sometimes translated “tranquility”) requires three main subsidiary achievements: freedom from fear of death, freedom from fear of the gods, and freedom from excessive desire.

Epicurean arguments that death is not fearful continue to attract a great deal of attention from contemporary philosophers. The Epicureans argue that death is the end for us; we are not immortal. Then—and this is where contemporary discussion usually begins—being destroyed cannot harm us, for two reasons. First, when we are dead, we perceive nothing, and only what we perceive can harm us. (Some people object: things we do not perceive can harm us, as when a friend betrays us but we never find out.) Second, when we exist, we are not yet dead, so death cannot harm us while we are alive. Once we are dead, we no longer exist, so death cannot harm us when we are dead either. The second argument can be developed in various ways. The Epicurean poet Lucretius asks whether we were harmed by our pre-natal non-existence, and argues that if we were not, then our post-mortem non-existence also will not harm us. (Some people object: we can be harmed when we do not exist, as when a project that we care about and work hard to support fails after our death. Nothing pre-natal could harm us in this way.) One important clarification: as we shall see, the Epicureans think it is (usually) natural to try to avoid death. However, trying to avoid death does not entail fearing it, any more than we must fear getting our shoes wet in order to avoid getting our shoes wet.

The Epicureans try to remove fear of the gods by appealing to the concept of divinity: gods are immortal and blessed. But perfectly blessed gods can neither be benefited nor harmed by others (including human beings). So, they will never be grateful to human beings for benefiting them or angry at human beings for harming them. Therefore, the phenomena popularly ascribed to divine agency—for example, thunderbolts, seen as expressions of divine anger—cannot be explained that way. To vindicate this claim, they offer scientific accounts of the world solely in terms of the basic principles of atoms and void.

Finally, the Epicureans divide desires: some are natural and others are not. The former are grounded in actual human needs; the latter (for example, the desire to have statues erected in one’s honor) are not. Among the natural desires, some are necessary and others are not. Unnecessary natural desires are grounded in actual human needs (they are natural), but they aim to meet that need in a particular way, even though it could be met in many other ways. For example, caviar can meet the human need for food, so desire for caviar is natural. But our need for food can be met in many ways, so the desire for caviar is not a necessary desire. Natural and necessary desires are for the proper objects of genuine human needs. There are three kinds of natural and necessary desires, depending on what they are necessary for: happiness, freedom from bodily pain, and life. This division is fairly clear: we need some things to stay alive, and desires for those things are natural and necessary. But we could be alive and in severe bodily pain, which is naturally bad for us. So, desires for what we need to remove bodily pain are also necessary—for example, food and drink in general (but not caviar and champagne specifically). Further, we can be alive and free from bodily pain but still miserable, because our minds are troubled. Thus, we also have natural and necessary desires for what can remove mental trouble: virtue and friendship.

Several virtues can be treated fairly quickly. Courage is the state in which one is free from irrational fear of death and the gods (which also requires piety). Temperance is the state in which one has natural desires and abandons unnecessary desires whenever circumstances make it difficult to eat (say) caviar instead of barley. Wisdom is knowledge of death, the gods, desires and pleasures, and the basic structure of the cosmos; it instills piety, courage, and temperance. That leaves the most interesting virtue for the Epicureans, justice, which has both social and personal aspects. Socially, justice is a useful agreement—in particular, an agreement to neither harm nor be harmed. For an agreement to be just, it must actually be useful. Which agreements are useful (and so just) varies, so different agreements are just in different circumstances. Still, the core concept of justice as a useful agreement does not change. Next, there are two accounts of why personal justice is important. First, even if one can get away with violating just social agreements, one cannot be sure that one will get away with it. So, violating just social agreements causes fear. Fear is a psychological pain; since such pains are greater than bodily pains, whatever material goods one hopes to gain by violating a just social agreement cannot compensate for injustice’s cost in fear. Second, whatever one might hope to gain through injustice will not be necessary for life, health, or tranquility. Since the sage is temperate, she desires only what is necessary to life, health, and tranquility. Such limited goods are (usually) easily obtained. So, the sage has no incentive to violate just social agreements. Whenever extreme circumstances might seem to give an incentive, we should reconsider whether the original agreements are genuinely useful in those extreme circumstances, and so whether the agreements are still just.

Lastly, Epicurus praises friendship for its ability to make us tranquil. It is tricky to say how friendship and justice differ. Epicurus says justice is an agreement neither to harm nor be harmed, which suggests a possibility: justice seeks mutual avoidance of harm—not only by not harming one another, but also by assisting each other in not being harmed. Friendship goes beyond that; it requires mutual benefit. But what kind of benefits? Friends help each other when necessary, and Epicurus agrees that this is one benefit of friendship. But more important for our tranquility is our confidence that we will have help from our friends in the future, if we need it. This includes not only help with mundane tasks like moving, or momentous ones like providing for one’s children after one dies, but also help in philosophical therapy: the Epicureans actually formed a sort of commune near Athens and dedicated their days to such therapy through an elaborate set of confessional practices. Thus, friends help each other to achieve the highest good (tranquility) by helping each other to achieve its necessary means (virtue).

6. Pyrrhonism

Pyrrho himself is a nebulous figure, but in the wake of the Academy’s later skeptical turn (see above), Aenesidemus revived his legacy by using him as a figurehead for a different skeptical tradition. The differences between Academics and Pyrrhonists are not always easy to discern. Our main source for Pyrrhonism, Sextus Empiricus, says there are three kinds of philosophers: dogmatists (who claim to have grasped the truth), Academics (who say the truth cannot be grasped), and Pyrrhonists (who are still inquiring). Thus, Sextus effectively characterizes Academics as dogmatists who claim to have grasped one truth. However, his classification does not withstand scrutiny. The Academics follow persuasive appearances, and any claim that the truth cannot be discovered may be understood as what is plausible after extensive inquiry (not: what they claim to grasp as the truth). As we shall see, the Academics’ use of persuasive appearances is not far from what the Pyrrhonists say and do.

Still, there is a clear difference in the ethical attitudes taken by Academics and Pyrrhonists. The Academics typically say that something like the Aristotelian or Stoic view—that virtue and virtuous activity are the highest or only goods—is plausible. The Pyrrhonists say that their end is tranquility (again, ataraxia). This places their ethical attitude closer to the Epicureans, though their recipe for tranquility is rather different. (Here it is worth noting that later Roman Stoics also emphasized tranquility in a way that the early Stoics did not.)

We must work up to that point by considering the development of a young Pyrrhonist. First, she notices that different appearances often make incompatible reports. (The wind seems warm to her and cold to another; cremating the dead seems respectful to her and disrespectful to another.) Aenesidemus listed many ways that appearances can disagree, the “Aenesideman modes.” Such disagreement or relativity of appearances is puzzling: which appearances reflect how things really are? On topics that we care about, such puzzlement is painful and provokes attempts to remove it by vindicating some appearances over others. That is, puzzlement provokes inquiry into how things really are in themselves, as opposed to how they appear to various subjects.

When the Pyrrhonist inquires, though, she discovers equally strong reasons on both sides of every question. Further, whatever considerations she might appeal to in trying to resolve the dispute are also matters of disagreement, requiring more inquiry, and so on. The state in which one finds equally strong reasons on both sides of an issue is “equipollence”; the Pyrrhonist responds to equipollence by suspending judgment on which appearances reflect how things really are. When she does so, the pain that she felt at being puzzled dissolves. Sextus offers a simile: Apelles was trying and failing to paint the froth on a horse’s mouth. In frustration, he threw his sponge at the canvas; fortuitously, it produced the desired effect. Likewise, the budding Pyrrhonist wants to rid herself of troubles about the real nature of things by discovering the truth. She never finds reasons for any particular view better than the reasons on the other side. So, she suspends judgment. But when she does, she fortuitously achieves the end she sought: tranquility. As mentioned above, though, she does not rest on her laurels at this point; rather, she keeps inquiring.

Like Academics, Pyrrhonists must explain how they act. The Pyrrhonist criterion of action is the appearance. We can approach this through the examples of relativity above. When the wind seems warm to one person and cool to another, and they have equally strong reasons to trust each appearance, they might suspend judgment on the question whether the wind is really warm or cool. But this does not remove the appearances; the wind still seems cool to one and warm to the other. It also does not prevent either from acting on her appearance. One might put on another layer of clothing, while the other takes one off. Likewise when two people disagree whether it is respectful to cremate the dead. We might find equally good reasons to say that cremation is respectful and that it is disrespectful. But it may still seem respectful to one person and disrespectful to the other, and nothing prevents each person from acting on how things seem to them. (It is an open question whether this will produce toleration of different opinions or simply make practical disputes irresolvable.) It is unclear exactly how much the Pyrrhonist criterion of action, the appearance, differs from the Academic criterion, the plausible appearance. For example, both Pyrrhonists and Academics follow their traditional religious practices, which suggests some convergence in how ancient skeptics of different stripes deal with action.

Again, their clearest difference concerns the final end. Naturally, the Pyrrhonists do not dogmatically assert that tranquility is the end; it simply seems to them to be the end, and they act based on that appearance. But they say more about why Pyrrhonism seems to be the best path to tranquility—better than Epicureanism, for example. Certain appearances and feelings are unavoidable for us: hunger seems painful and leads us to relieve it. There is no getting rid of these appearances and feelings. However, those who dogmatically assert that pain is bad (for example) face a double dose of pain. They feel not only the inevitable pain of hunger, but also the further pain of mental trouble on reflecting that they possess something that is (by their lights) really bad for them. The Pyrrhonist, however, suspends judgment on the question whether the pains of hunger are really bad for her. Thus, she maintains her tranquility even in the face of life’s inevitable nuisances.

 

7. References and Further Reading

a. Primary Works

  • J. Annas and R. Woolf, Cicero: On Moral Ends (Cambridge: Cambridge University Press, 2001).
    • Cicero presents the ethical views of the Epicureans, Stoics, and Antiochus, and disputes them with reference to Carneades’ division of ethical theories.
  • C. Brittain, Cicero: On Academic Scepticism (Indianapolis: Hackett Press, 2006).
    • Our main source of information about the Stoic–Academic debate and the development of the Skeptical Academy.
  • J. Cooper, Plato: Complete Works (Indianapolis: Hackett Press, 1997).
  • R. Crisp, Aristotle: Nicomachean Ethics (Cambridge: Cambridge University Press, 2000).
  • M. Griffin and E. Atkins, Cicero: On Duties (Cambridge: Cambridge University Press, 1991).
    • Cicero adapts and extends Panaetius’ work on appropriate actions.
  • B. Inwood and L. Gerson, Hellenistic Philosophy (Indianapolis: Hackett Press, 1998).
    • An excellent source book.
  • A. Long and D. Sedley, The Hellenistic Philosophers (Cambridge: Cambridge University Press, 1987).
    • Another excellent source book; v.1 contains translations, while v.2 contains the texts translated (and sometimes more) together with a substantial bibliography.

b. Secondary Works

  • K. Algra, et al., The Cambridge History of Hellenistic Philosophy (Cambridge: Cambridge University Press, 2000).
    • A series of essays by various authors on central topics, with an extensive bibliography.
  • J. Annas, An Introduction to Plato’s Republic (Oxford: Oxford University Press, 1981).
  • J. Annas, The Morality of Happiness (Oxford: Oxford University Press, 1993).
    • Influential overview of the ethical theories of Aristotle and the main Hellenistic schools.
  • J. Annas, Platonic Ethics, Old and New (Oxford: Oxford University Press, 2000).
    • Argues for a Stoicized interpretation of Plato’s ethics by reference to Middle Platonist readings of Plato.
  • J. Barnes, The Toils of Skepticism (Cambridge: Cambridge University Press, 1990).
  • T. Brennan, The Stoic Life (Oxford: Clarendon Press, 2005).
  • S. Broadie, Ethics with Aristotle (Oxford: Oxford University Press, 1991).
  • G. Fine, Plato 2: Ethics, Politics, Religion, and the Soul (Oxford: Oxford University Press, 1999).
    • A collection of essays, including many classics.
  • T. Irwin, Plato’s Ethics (Oxford: Oxford University Press, 1995).
  • R. Kraut, Aristotle on the Human Good (Princeton: Princeton University Press, 1989).
  • G. Lear, Happy Lives and the Highest Good (Princeton: Princeton University Press, 2004).
    • A study of the relationship between ethical and theoretical virtues in Aristotle.
  • P. Mitsis, Epicurus’ Ethical Theory (Ithaca: Cornell University Press, 1989).
  • M. Nussbaum, The Fragility of Goodness (Cambridge: Cambridge University Press, 1986).
    • A study of moral luck in Greek tragedy, Plato, and Aristotle.
  • M. Nussbaum, The Therapy of Desire (Princeton: Princeton University Press, 1994).
    • Essays on ethical theory and therapy in Aristotle and Hellenistic philosophy.
  • T. O’Keefe, Epicureanism (Berkeley: University of California Press, 2009).
  • A. Rorty, Essays on Aristotle’s Ethics (Berkeley: University of California Press, 1980).
    • A collection of essays, including many classics.
  • H. Thorsrud, Ancient Skepticism (Berkeley: University of California Press, 2008).

Author Information

Clerk Shaw
Email: jshaw15@utk.edu
University of Tennessee
U. S. A.

Zhou Dunyi (Chou Tun-i, 1017-1073)

Zhou Dunyi (sometimes romanized as Chou Tun-i and also known as Zhou Lianxi) has long been highly esteemed by Chinese thinkers. He is considered one of the first “Neo-Confucians,” a group of thinkers who draw heavily on Buddhist and Daoist metaphysics to articulate a comprehensive, Confucian religious philosophy.

This article begins with a brief look at Zhou’s life and historical context before turning to a detailed examination of his major writings.  It then looks at major themes in Zhou’s work as well as a few important philosophical concerns that his writings address.  Finally, it turns to Zhou’s legacy and influence, providing information on additional readings for further study of Zhou’s thought.

Zhou combines deep spirituality with an emphasis on morality and politics. He places this humanistic ideal within a cosmic vision wherein the forces of creation find their fullest expression in human beings. Essentially, he articulates the common metaphysical framework that informed Chinese philosophy for nearly a millennium. In his work, Zhou follows earlier thinkers such as Mencius (Mengzi, 372-289 B.C.E.), but, unlike some of his stricter Confucian brethren, Zhou draws heavily on ideas associated with Daoism and Buddhism. This is particularly the case with Zhou’s stress on the primacy of “stillness” (jing) over “activity” (dong) and his strong cosmological orientation. Moreover, Zhou’s temperament seems marked more by Buddhist notions of equanimity and compassion than by stereotypical Confucian formality and restraint. For these reasons, Zhou remains an intriguing yet controversial figure.

According to Zhu Xi (1130-1200), perhaps the most eminent early Neo-Confucian thinker, Zhou was the first sage since Mencius and a key figure in the “new transmission” of the Confucian Way (Dao). Zhou transmitted the Way to the Cheng brothers, Cheng Hao (1032-1085) and Cheng Yi (1033-1107), who then transmitted the Way to Zhu himself. In this view, Zhou is the “founding ancestor” of Zhu Xi’s school of Neo-Confucianism, a philosophical system that profoundly informed East Asian societies for centuries. Zhou’s best-known works are the “Explanation of the Diagram of the Supreme Polarity” (Taijitu shuo) and Penetrating the Classic of Changes (Tongshu), both of which are included in the Zhouzi Quanshu (Collected Works of Master Zhou). Zhou also wrote a short poetic essay, “On the Love of the Lotus” (Ai lian shuo), that is part of the standard secondary school curriculum in contemporary Taiwan.

Table of Contents

  1. Life and Context
  2. Works
    1. “Explanation of the Diagram of the Supreme Polarity”
    2. Penetrating the Classic of Changes
    3. “On the Love of the Lotus”
  3. Key Concepts
    1. Fundamental Unity within Diversity
    2. Human Nature
    3. Authenticity as Humanity’s Ethical and Ontological Basis
    4. Inseparability of Ethical Life from the Workings of the Cosmos
    5. Sageliness as Ideal for Daily Life
  4. Principal Concerns
    1. Lineage
    2. Daoist and Buddhist Influences
    3. Criticism of Other Thinkers
    4. Quietism
    5. The Problem of Evil
  5. Legacy
  6. References and Further Reading

1. Life and Context

Much of what we know of Zhou’s life comes from the Song Shi (History of the Song Dynasty), as well as anecdotes preserved in Reflections on Things at Hand (Jinsi lu), the anthology of Song-era Confucian treatises compiled by Zhu Xi with the help of the historian Lü Zuqian (1137-1181).  Totaling some 622 passages culled from the writings of key thinkers (along with Zhu Xi’s comments), this book ranks as one of the most important works of Chinese philosophy.

Zhou was born in Daozhou (modern-day Hunan) into a family of scholar-officials. His “style name” was “Maoshu.” Originally, his personal name was “Dunshi,” but due to the taboo against using the name of the emperor (a widely observed practice in traditional China), Zhou’s name was changed to “Dunyi” when Emperor Yingzong ascended to the throne in 1063. When Zhou was 14 years old, his father passed away, and he was adopted by his maternal uncle, Zheng Xiang. It was through his uncle’s work that Zhou attained his first governmental post. During his career, Zhou served as district keeper of records, magistrate of various counties, and assistant prefect. Traditional accounts say that he was quite diligent in his duties, earning high praise from his colleagues and superiors; yet Zhou refused to participate in the civil service examination system, the typical route by which bright and capable men gained access to the elite levels of Song society. As a result, Zhou never held a high governmental position, nor did he attain the coveted “presented scholar” (jinshi) degree, the highest rank and a virtual necessity for attaining an influential post.

Towards the end of his life, Zhou fell ill and was transferred to Xingzi in Jiangxi province, where he settled near the foot of Mount Lu, one of China’s sacred mountains.  Here he built a retreat along a tributary of the Pen River, naming it Lianxi (“Stream of Waterfalls”) after a stream in his home village; later generations honored Zhou by calling him “Master Lianxi” after his beloved study. Zhou resigned from office in 1071, passing away about eighteen months later.  During his lifetime, Zhou was not well known, even though he briefly tutored both Cheng Hao (1032-1085) and Cheng Yi (1033-1107) when they were young.  His contemporaries, however, revered him for his warm personality and intuitive insight into the Way of Heaven.  Later Neo-Confucians came to regard him as an exemplar of “authenticity” (cheng), much like Confucius’ disciple Yan Hui.  In 1200, Zhou was posthumously dubbed Yuangong (“Duke of Yuan”) and in 1241 was honored in the sacrifices performed in the official Confucian temple.

Zhou lived during the Northern Song (960-1126), the “second golden age” of Confucianism. The initial impetus for this Confucian renaissance came from late Tang Confucian thinkers such as Han Yu (768-824), Li Ao (772-836), and Liu Zongyuan (773-819). They were highly critical of Buddhism and advocated for a return to what they considered the true source of Chinese civilization (in Zhu Xi’s words, “this culture of ours”), a heritage enshrined in the Classical Confucian texts. After the collapse of the Tang and the eventual rise of the Song dynasty, Confucianism became the guiding Way and, just as in the Han dynasty (206 B.C.E.-220 C.E.), anyone seeking an official position had to be schooled in Confucian texts and doctrines.

The Confucian revival in the early Song was by no means monolithic, however, and several prominent thinkers also pursued studies outside of official circles. While looking to Confucian ideas, many of these thinkers investigated and embraced Daoist and Buddhist notions, particularly those pertaining to spiritual self-cultivation.  The ensuing creative tension between these intertwined lines of thought inspired new interpretations of classical texts and pushed Confucianism beyond its traditional boundaries.  Among these thinkers, Zhu Xi singles out a select few as the “Masters of the Northern Song,” a group that included Zhou Dunyi, Shao Yong (1011-1077), Zhang Zai (1020-1077), and the aforementioned Cheng brothers.  While it would be wrong to consider these men as forming an institutionalized school, they were united in the view that a society based on the Way could only be achieved through personal reform grounded in cultivation of the xin (“mind-heart”) to harmonize Heaven, Earth, and Humanity.

2. Works

For such an influential figure, Zhou authored surprisingly few works.  In fact, of the 622 passages in Reflections on Things at Hand, only 12 are by Zhou—far fewer than the number of passages from Zhang Zai and the Chengs.  Most people know Zhou for his essay “Explanation of the Diagram of the Supreme Polarity” (Taijitu shuo) along with his extensive commentary, Penetrating the Classic of Changes (Tongshu).  Both texts focus on cosmology as well as the ethical and spiritual implications of their depictions of the cosmos, and both texts continue to exert tremendous influence on Chinese thought.  In addition, Zhou is credited with “On the Love of the Lotus” (Ai lian shuo), a short poetic essay that, like many such works, reveals unexpected philosophical depths.

a. “Explanation of the Diagram of the Supreme Polarity”

According to tradition, Zhu Xi was so struck by this treatise that he placed it at the beginning of Reflections on Things at Hand, thereby assuring it pride of place in Neo-Confucian thought. Broadly speaking, it has two main parts: the essay itself, which outlines the evolution of the cosmos, and the accompanying “Diagram” (taiji tu), a graphic illustration of the cosmic process described.


Taiji Tu from an ancient Chinese text

The main theme of the Diagram is simple: the human and cosmic realms are governed by the same norms; the microcosm and the macrocosm correspond perfectly.  Much like earlier Chinese thinkers, Zhou proclaims that human life (including the socio-political realm) is rooted in the Way of Heaven, and that it is the duty of the sage-ruler to ensure that the cosmic and human realms harmonize.  Nonetheless, Zhou presents this cosmology in a particularly powerful manner, prompting later thinkers to consider the “Explanation” a true masterpiece.

A close look at the “Explanation” yields interesting insights.  The treatise can be divided into six parts, each corresponding to certain figures in the Diagram.  Part 1 begins with the mysterious “Non-Polarity” (wuji), the primordial yet indefinite source of all reality, which Zhou identifies with the “Supreme Polarity” (taiji), the core of actual existence.  The taiji gives rise to yin and yang by alternating from stillness to activity and back.  Part 2 picks up with the yin and yang, speaking of how their alternation and combination produces the Five Phases (wuxing: water, fire, wood, metal, earth), which in turn form the basis for the cycles of nature (the Four Seasons).  In Part 3, Zhou circles back to include the wuji and taiji, the “Two Modes” (yin and yang), and the Five Phases, noting that the latter interact and stimulate one another, thus generating the myriad things of our world.

At this point, Zhou has covered the entire Diagram, yet the “Explanation” is only half finished.  With Part 4, he shifts to humanity, which emerges from the cosmic processes and, as such, is governed by both yin and yang which together engender our “five-fold nature.” In Part 5, Zhou turns to the sage, the ideal Chinese ruler, who more clearly perceives and embodies the cosmic forces than the majority of humankind.  Mirroring the cosmic rhythm, the sage addresses and “settles” human affairs through the Confucian virtues of centrality, correctness, humaneness, and rightness while abiding in “stillness.”  Finally, in Part 6 Zhou turns to the Yijing (Classic of Changes), referring to the Sage’s wisdom as one that embraces cosmic and human truths.

Zhou makes liberal use of paradoxical language in the “Explanation,” notably in the first line, where he at once distinguishes the wuji and taiji and joins them together. In doing so, Zhou suggests an equivalence, if not actual identity. Zhou continues in this same rhetorical mode, speaking of the incipient cosmos as both “still” and “active” in its functioning: “Activity and stillness alternate; each is the basis of the other.” In Part 2, Zhou proclaims that the Five Phases are fundamentally one—“simply yin and yang; yin and yang are simply the taiji”—while each has its own nature. Part 4 opens by declaring that humans have the “finest and most spiritually efficacious [qi],” thus singling us out for special consideration. Humans are distinct yet not separate from other beings or the processes of creation. Part 5 focuses on the sage, a mysterious figure who manages human affairs effortlessly, as though he were the very working of nature. Finally, Zhou concludes by stating “Great indeed is the Yijing! Herein lies its excellence!” By closing on this note of awe, Zhou suggests that his treatise proffers a glimpse of the Sage’s cosmic vision.

Zhou’s “Explanation” is at once stirring, enlightening, and maddeningly mysterious, and this air of mystery is a source of the text’s power. The mystery deepens as Zhou leads us through the Diagram, largely because he describes rather than explains the various figures, and he is strangely silent on some of the Diagram’s aspects. The essay thus resembles a theological treatise, laying out basic teachings derived from “scripture,” such as the Yijing, and relating them in a coherent way. In this regard, it is akin to the Nicene Creed, a formal statement of core beliefs shared by many traditional Christians. Like the Creed, Zhou’s “Explanation” assumes its readers are familiar with its ideas, presenting them as “articles of faith” but never arguing for why these things should be the case.

b. Penetrating the Classic of Changes

This work comprises forty chapters in all yet since each chapter is only a paragraph or so in length, it is still relatively short.  Ostensibly, the title Tongshu comes from Zhou’s insistence that its principles penetrate (tong) and harmonize with the Yijing.  The treatise also draws on the Zhongyong (Doctrine of the Mean), the Shujing (Classic of History), and the Analects. It is likely that the “Explanation” was originally the last section of the Tongshu but that Zhu Xi moved it to the beginning; eventually, it became an independent work due to its importance in Neo-Confucian thought.

The treatise’s main themes are central to the Neo-Confucian project: the necessity of authenticity (cheng) in attaining Sageliness, and how to enact Sageliness in accord with the cosmos to establish the true Way (Dao). Zhang Boxing (1652-1725), who compiled the Complete Collection of Zhou Dunyi’s Writings (Zhou Lianxi xiansheng quanji), divides the Tongshu into two parts, each comprising 20 chapters. Certain ideas and concerns link various chapters, but a detailed presentation lies beyond the scope of this article. Instead, this overview highlights key points and includes quotes to provide a sense of Zhou’s voice and style.

Part 1: Tongshu, Chapters 1-20

The first half of the treatise begins with a stirring proclamation: “Being authentic is the foundation of the sage.” Over the next few chapters, Zhou then touches on several traditionally Confucian topics: the importance of moral virtue, the necessity of learning, how to govern properly, and so forth.  Not surprisingly, Zhou grounds each of these human concerns in the workings of the cosmos, much as we have seen with the “Explanation.” However, there are a few points in this first half that make the Tongshu rather unique and thus warrant close attention.

Chapters 7-10, for instance, consist of questions from unnamed students and Zhou’s replies, thereby rhetorically underscoring the essentially pedagogical and dialogical nature of Confucianism. Hearkening back to the example of Confucius, the text presumes that the reader is engaging with the teachings as if face-to-face with the teacher, the “old model” (laoshi), who, in this case, is Zhou himself. Chapter 7 (appropriately entitled “The Teacher”) opens: “Someone asks: ‘Who makes all under Heaven good?’” Reply: “The teacher.” Question: “What do you mean?” Reply: “[He is one whose] nature is simply in equilibrium between firm and yielding, good and evil.” Over the next few chapters, the Teacher reminds us of the good fortune of being able to correct our errors and of the importance of thinking as an activity rooted in our primal authenticity, and stresses devotion to learning as we progress towards Sageliness.

Chapters 17-19, on the other hand, deal with what might seem to be a minor consideration: music, and, by extension, the “arts” in general. However, this topic is, in fact, central to Confucianism, which consistently upholds the importance of cultural refinement (wen) as part of the Way.  Echoing words from Confucius himself, Zhou speaks of music as a positive influence on people, helping attune them to each other.  Thus he says in chapter 17, “[The ancient sages and kings] created music to give expression to the airs of the eight [directional] winds and to pacify the dispositions of all under Heaven.”  Not only does music attune the mind-hearts of all people, it also harmonizes us with animals and spiritual beings.  We see in this short section the inseparability of the aesthetic, ethical, and spiritual dimensions of sagely learning.

Chapter 20 both summarizes Zhou’s points so far and leads us into the next half.  It is fitting, then, that the chapter is entitled “Learning to be a Sage,” and, like chapters 7-10, it is a dialogue between student and teacher.  The teacher explains the essentials of the Way of the sage, saying, “Unity is essential. To be unified is to have no desire. Without desire one is unoccupied when still and direct when active. Being unoccupied when still, one will be clear; being clear one will be penetrating. Being direct in activity one will be impartial; being impartial one will be all-embracing. Being clear and penetrating, impartial and all-embracing, one is almost [a sage].”

The Daoist flavor in the first half of the Tongshu is unmistakable, but Zhou is not suggesting that the sage observes the world with an empty mind. Rather, he holds that striving for sageliness means uniting all of one’s faculties. Rooted in one’s true nature, undistracted by wayward desires, and unoccupied by selfish lusts or passing whims, one is directly involved with all things. One can then see clearly and thus respond appropriately.

Part 2: Tongshu, Chapters 21-40

This second half of the Tongshu shifts from the more metaphysical stance of the first part to a more explicitly ethical orientation. Zhou starts off in a typically Confucian fashion by focusing on governing society, stressing the importance of being “impartial” (gong)—scrupulously avoiding selfishness—in order to attain “clarity” (ming). Much like the “Explanation,” the first few chapters correlate moral virtue with the cosmic processes of yin, yang, and the wuxing. Following contemporary Confucian Tu Weiming, we can say that Zhou articulates an “anthropocosmic” vision here. However, the references to the Zhongyong as well as the necessity for intelligence in perceiving truth remind us that metaphysical knowledge is but the first step towards enacting the Way.

One of the most interesting things in this second half of the Tongshu is the central role played by Yan Yuan (Yan Hui), Confucius’ most mystically inclined disciple. At one point, for instance, Zhou exalts Yan Hui’s example: “Seeing what was great, his mind was at peace. With his mind at peace, nothing was insufficient. With nothing insufficient, then wealth and honor, poverty and humble station were all the same [to him]. Being all the same, then he was able to transform and equalize [others, that is, regard others as equal]. Thus Yanzi [Yan Hui] was second only to the Sage [Confucius].” Zhou underscores this same point a little later, in chapter 29, where he exalts Confucius’ “comprehensiveness,” after which he immediately praises Yan Hui as the only one who was able to discern this quality and model it for succeeding generations.

While the entire Tongshu draws on the Yijing, it focuses most explicitly on that work in the last 10 chapters.  Much of this section reads as if Zhou were leading the reader through a ritual consultation of that most enigmatic of Chinese classics, referring to hexagrams #1, 24, 25, and 37, among others. Furthermore, it quotes passages from the Xici (“Appended Remarks”), the most philosophically rich section of the Yijing.

Zhou begins the last section of the treatise (chapters 37-40) very simply, invoking the cosmic basis of the sagely Way and stressing the sage’s impartiality while recalling the pedagogical dialogue of earlier sections: “The Way of the sage is perfectly impartial,” I said. Someone asked, “What does that mean?” I replied, “Heaven and Earth are perfectly impartial.” Finally, in the very last chapter, Zhou concludes the Tongshu by giving guidance towards the sagely Way through lines of the Yijing.  The last few sentences warrant special attention: “Be cautious! This means [to follow] the ‘timely mean’!  ‘Keep the back still,’ for the back is not seen. When still, one can stop [at the right point]. To stop is not to act [deliberately]. To act [deliberately] is not to stop [at the right point]. This Way is profound!”

All told, the Tongshu is a rich, evocative text, appropriately mirroring the mysterious and compelling wisdom of the Yijing.  Zhou’s elusive yet allusive style draws on multiple sources, encouraging the reader to make connections between the different sections and events within her own life. While revealing an inspiring cosmic vision, however, it continually reminds readers that its truth can only be realized when enacted daily.

c. “On the Love of the Lotus”

While not a philosophical treatise, “On the Love of the Lotus” (Ai lian shuo) remains Zhou’s most beloved work and reveals surprising spiritual depths.  According to tradition, Zhou composed the poem in 1071 after he built his retreat, Lianxi, at the foot of Mount Lu.  As was common practice among retired literati (Chinese scholar-bureaucrats), he dug a pond in front of his study and planted it with lotus blossoms, spending much of his leisure time contemplating the scene.

“On the Love of the Lotus” totals some 119 characters in addition to its title, arranged in eleven lines. Each of the lines is a couplet of verses, varying in length. Zhou wrote this piece in the gu wen (“ancient writing”) style, a literary style hearkening back to the elegant prose of the Han dynasty. This style had become increasingly common during the Confucian Renaissance, was a favorite of the late Tang Confucian critic Han Yu, and contrasts with the “parallel prose” style, with its very strict meter and rhyme scheme, that had previously dominated Chinese prose. During the Song era, gu wen became the style of choice among the literati, and was a rhetorical signal that the writer and reader were dealing with a work of “special writing” concerning high-minded ideals, versus low or vulgar subjects, much as “the King’s English” functioned in the British Empire during the 19th and 20th centuries. A mark of education and culture, gu wen was still accessible to a degree to members of the lower classes, and thus exemplified the power of wen as a culturally binding force among the Chinese populace.

On the surface, “On the Love of the Lotus” is Zhou’s heartfelt ode to the flourishing blossoms in his garden, evoking the serene presence of flowering chrysanthemums, peonies, and lotuses, each with its distinctive aura and beautiful form. Yet the piece hints at subtle depths of meaning, pointing to the anthropocosmic vision that Zhou so explicitly discusses in his other works. For Zhou, the lotus exemplifies the cosmic/spiritual harmony that we should all seek. Thus he says, “Inside, it is open; outside, it is straight (zhi),” a line recalling the time-honored Chinese ideal of Dao. Zhou contrasts this with the chrysanthemum, which is the “recluse” among the flowers, and the peony, which he speaks of as “wealthy,” or showy, gaudy, and appealing to the masses. The lotus, on the other hand, is the “gentleman among flowers.” The term “gentleman” (junzi), of course, has since the time of Confucius denoted the ideal human being.

Like other Chinese literary works, “On the Love of the Lotus” draws on cultural tropes shared by Confucian, Daoist, and Buddhist traditions.  This is most obvious with the image of the lotus itself.  As Zhou writes, “I love only the lotus, for rising from the mud yet remaining unstained; bathed by pure currents and yet not seductive.” “On the Love of the Lotus” pulses with subtle yet powerful symbolism, evoking a deep, tranquil mood while encouraging a dynamic and attentive state of awareness.   It thus gives a glimpse of the sagely mind itself.

3. Key Concepts

Zhou’s works, while creative and eclectic in nature, establish the basic parameters of Neo-Confucian philosophy. While he never articulates a full-fledged system, most of the concepts he discusses support each other.  This overview, therefore, looks at key themes running through Zhou’s writings, explaining what they entail and how they connect to each other.

a. Fundamental Unity within Diversity

A perennial issue in philosophy as expressed in all cultures is the relationship between the myriads of phenomena in the world, which are diverse and seemingly constantly changing, and the underlying unity and stability within this vast whole.  A pond is filled with dozens of lotus blossoms, each distinct and with its own unique hue, some in bloom while others wither.  Yet all seem to embody the same “lotus-ness,” and each specific blossom remains its own, separate self throughout its life cycle.  Similarly, our world is peopled with thousands of different human beings, and every single person has his or her own unique background, thoughts and feelings.  And yet, each person’s life follows a similar pattern and each person embodies the same “human-ness.”  What is the relationship between the oneness and many-ness that characterizes our world?  This problem, the problem of “the One and the Many,” lies at the heart of many of the world’s philosophies, from the Pre-Socratics of ancient Greece, such as Thales, Anaximander, Heraclitus, and others, to the nameless ṛṣis who composed the Upaniṣads, to the various thinkers of classical Chinese civilization.  While answers have varied, most solutions assume that the world is “one thing” and so there has to be a unifying aspect to the obvious diversity.

For Zhou Dunyi, the answer is that a fundamental unity encompasses the myriads of things, including human beings. This unity, however, does not consist in some static metaphysical mush wherein all things collapse into a formless One, nor in some immaterial Divine Being (“God”). Rather, this unity is a dynamic, integrated system in which all things function together. We can see this clearly in the Tongshu, chapter 22, where Zhou succinctly summarizes the cosmic process: “The two [modes of] qi and the five phases transform and generate the myriad things. The five are the differentia and the two are the actualities; the two are fundamentally one. Thus the many are one, and the one actuality is divided into the many.”

Despite such an all-encompassing metaphysical scheme, Zhou maintains the decidedly human focus typically associated with Confucianism, offering an anthropocosmic vision in which the root metaphors for understanding humanity itself are drawn from the workings of Nature.  We can see this most clearly in Zhou’s “Explanation,” where he quotes from the commentary section of the Yijing: “the sage’s virtue equals that of Heaven and Earth; his clarity equals that of the sun and the moon; his timeliness equals that of the four seasons.” In this passage Zhou describes the Way of the sage, the ideal of humanity, in explicitly cosmological terms.  Rhetorically, the message is clear: the Way of humanity is the Way of the cosmos.

Students of Chinese thought may recognize in Zhou’s metaphysical vision yet another variant of the notion of humanity forming a triad with Heaven and Earth, perhaps best expressed in the statement, “the unity of Heaven and Humanity” (tianren heyi).  This harmonious unity of human beings and the cosmos lies at the center of Zhou’s philosophy and draws quite explicitly on earlier Confucian thinkers, notably Dong Zhongshu (c. 195-105 B.C.E.).  In some respects, Zhou Dunyi merely expands upon this basis by borrowing insights from Buddhism and Daoism which he integrates into Confucian tradition. Human beings, along with all other natural phenomena, are integral parts of a larger whole, and in Zhou’s view, we can see this teaching both metaphysically and ethically.  Julia Ching suggests that, under Buddhist influence, this idea transformed into the increasingly abstract adage “The Ten Thousand Things are One” (wanwu yiti), and although we can see aspects of such “pantheism” in Zhou’s writings, he never advocates pure withdrawal into metaphysical contemplation; for Zhou, embracing the actual embodied situation trumps mystical wonder.

Not surprisingly, this insistence on a non-dual unity-cum-diversity defies clear articulation. As with many mystical philosophers (for example, Zhuangzi, Huineng, Pseudo-Dionysius, Śaṅkara, and Ibn Arabi), Zhou often resorts to the language of paradox. Perhaps the most famous example is in the opening words of the “Explanation”: wuji er taiji (“Non-Polar(ity), and yet Supreme Polarity!”). This most curious of lines is composed of a negation and a positive affirmation linked by a conjunction. Grammatically, this phrase at once distinguishes the wuji and taiji and joins them together in some sort of identity. This simultaneous identity and difference echoes chapter one of the Daodejing: ci liang zhe tong chu er yi ming (“these two [wu, ‘non-being,’ and you, ‘being’] interpenetrate, yet, after emerging, differ in name”). Zhou resorts to paradox elsewhere in his writings as well. Rhetorically, paradoxical language poses difficulty for rational understanding. No doubt this more mystical dimension of Zhou’s work has encouraged interpretations that emphasize his debts to Daoism and Buddhism.

The paradoxical harmonious unity of humanity and the larger cosmos is also evident in Zhou’s discussion of “stillness” and “activity.”  As the second line of the “Explanation” reads: “The Supreme Polarity in activity generates yang; yet at the limit of activity it is still.  In stillness it generates yin; yet at the limit of stillness it is active.  Activity and stillness alternate; each is the basis of the other.”  Much like yin and yang, cosmic stillness and activity are complementary opposites: not antithetical, but co-entailing one another.  This cosmic pattern forms the model for the sage as well, who remains still in the midst of activity and active while keeping still.  Such active stillness and still activity express the fundamental dynamism governing existence as a whole.

One issue that arises with Zhou’s notion of unity within diversity is whether he is speaking strictly cosmologically, concerning the “physical” functioning of reality, or metaphysically, concerning the ultimate structure of the cosmos. Zhou’s writings are ambiguous on this point and lend themselves to both readings.  A. C. Graham, however, argues that Zhou is speaking cosmologically, and that the tendency to read Zhou metaphysically is due to Zhu Xi’s interpretation, in which he equates taiji with li (principle).

Zhou provides a subtle way to understand the psychological dimension of such unity when he speaks of “impartiality” (gong).  One who is “impartial” remains unswayed by petty desires, and thus can respond to any situation without complications.  As Zhou says, “Being direct in activity, one will be impartial; being impartial one will be all-embracing.”   There is no sense of withdrawal, but rather an active embracing of existence.  Moreover, such engaging with the world at large is the sage’s Way, a state that mirrors the cosmos.

b. Human Nature

Zhou’s anthropocosmic vision, centering as it does on the unity of Heaven, humanity, and all things, entails a specific notion of human nature (xing).  Indeed, discussion of human nature is one of the hallmarks of Neo-Confucian tradition. Unlike the Chengs and Zhu Xi, Zhou does not explicitly spell out his view of human nature, but we can infer quite a lot from his writings.

Zhou never uses the actual term xing in the “Explanation,” but he mentions it several times in the Tongshu.  Much like what we see in Mencius and the Zhongyong, Zhou implies that the nature of human beings is endowed by Heaven and is fundamentally good. Zhou once more turns to the Yijing: “The alternation of yin and yang is called the Way. That which issues from it is good. That which fulfills [or constitutes] it is human nature.” Zhou closes this important chapter on a particularly reverent note: “Great indeed is change, the source of human nature and endowment!”  Further on, in chapter 3, Zhou extols behavior in accord with the Five Constant Virtues (humanness, righteousness, propriety, wisdom, and honesty), observing that “One who is by nature like this, at ease like this, is called a sage. One who recovers it and holds onto it is called a worthy.”

Clearly, Zhou espouses the Mencian view of human nature as innately good.  Human beings are naturally moral creatures.  However, there is a tension in Zhou’s philosophical anthropology, in that the distinction between good and evil does not reside at the primary level of cosmic origin. As he states in the “Explanation”: “Only humans receive the finest and most spiritually efficacious [qi]. Once they are formed, they are born; when spirit [shen] is manifested, they have intelligence; when their fivefold natures are stimulated into activity, good and evil are distinguished and the myriad affairs ensue.”  Similarly, in chapter 3 of the Tongshu Zhou cryptically says, “In being authentic there is no [intentional] acting [wuwei]. In incipience there is good and evil.” Here Zhou’s insistence on stillness as cosmically fundamental means that this ultimate level transcends the distinction between good and evil; the latter distinction only arises when human beings begin to interact with actual things.  Later commentators have spilled much ink arguing about what Zhou means.

Broadly speaking, Zhou espouses the cultivation of the “mind-heart” (xin) that became a hallmark of Neo-Confucian religiosity, yet he apparently draws heavily on Daoism.  Certainly Zhou uses terms often associated with Daoist neidan (“inner alchemy”), notably qi, the basic “stuff” of the universe, shen (“spirit”), and even jing (“essence”), although he mentions the latter only once or twice. By contrast, Zhou says quite a bit about shen, which he associates with cognitive abilities.  Thus, as Zhou observes in the Tongshu, “That which ‘penetrates when stimulated’ is spirit (shen).”  Apparently shen lies dormant until it is stimulated by external phenomena, at which point it is activated and “knowing” begins.

The place of qi in Zhou’s view of human nature is vague.  That is, qi is a vital component of human beings and all things, yet Zhou never discusses it to the same extent that we find in the writings of later Neo-Confucians. Nor does he differentiate it explicitly from “Principle” (li).  In the “Explanation,” Zhou speaks of the wu xing as the basic phases of qi, and hence fundamental to the workings of the cosmos, going on to note that “Only humans receive the finest and most spiritually efficacious [qi].” This statement implies that human nature is unique; people have a special status in the world, albeit not as beings of a different order from the myriad other things.  Joseph Adler suggests that for Zhou, humans naturally manifest shen because they are endowed with the most refined qi.  It is due to the functioning of shen, then, that we are able to encompass all things.  Here, Zhou clearly anticipates later Neo-Confucian views concerning human cultivation as a refining of qi, although he does not speak of differences between people in terms of the “coarseness” and “refinement” of qi. We should note, however, that he does not articulate the full explanation we find in Zhu Xi’s works.

c. Authenticity as Humanity’s Ethical and Ontological Basis

Following the spiritual current of Confucian tradition exemplified in Mencius and the Zhongyong, Zhou maintains that authenticity (cheng) is essential to being fully human. In fact, Zhou opens the Tongshu by declaring, “Being authentic is the foundation of the sage.”  He goes on to add that it is “the foundation of the Five Constant [Virtues]” as well as being “perfectly easy, yet difficult to practice.”  Later, he underscores this rather paradoxical point by saying, “In being authentic, there is no [intentional] acting.”  This seems decidedly Daoist (Zhou actually uses the term wuwei here), but Zhou’s meaning can only be understood in terms of authenticity itself.  For Zhou, authenticity expresses human nature as it truly is; to be authentic is to manifest one’s Heavenly endowment.  Speaking metaphorically, to be authentic is to remain still in one’s nature while acting in the world.  Authenticity is, thus, both ontological and ethical; it is a manifestation of our fundamental being, while also serving as the root of moral activity.

For Zhou, being authentic is intimately tied to self-cultivation, a central concern of Song Confucianism that forms the heart of Neo-Confucian spirituality.  In some sense, authenticity is a “given,” as it is rooted in our nature, yet we must work to develop it, just as with any innate ability.  Zhou stresses the importance of such ethical/ontological striving throughout the Tongshu.  Moreover, Zhou states that it is possible to be inauthentic (bu cheng) when in chapter 2 of the Tongshu he speaks of the Five Constant [Virtues] and the “hundred practices”  of moral behavior as being “wrong” or “blocked by depravity and confusion.”  Presumably, such cases arise when one is gripped by selfishness and egotism.

One of the most intriguing and controversial points that Zhou makes about striving for authenticity is that being authentic, a way of returning to one’s true human nature, is also the way for a person to “become One” (yi).  Moreover, Zhou says that to be in such a state is “to have no desire.”  Zhou strikes a decidedly mystical tone here, with a slight ascetic edge that resonates strongly with Buddhism and Daoism.  Contra Max Weber, the sociologist of religion who famously distinguished between ascetic and mystical forms of religion, Zhou suggests a spirituality that straddles this dichotomy.  Certainly when read in context, Zhou actually seems to mean a state of clear, yet active engagement with one’s situation.  Zhu Xi and later commentators, perhaps at pains to distance Zhou from accusations of Buddhist and Daoist influence, explain that Zhou means that one should attain an unbiased, undistracted state rather than renounce the world.

d. Inseparability of Ethical Life from the Workings of the Cosmos

As we have seen in his understanding of authenticity, Zhou also proclaims the integral relationship of cosmology and ethics. While this is a central theme in the “Explanation” and the Tongshu, one of the best hints of this point comes in “On the Love of the Lotus,” where he refers to the lotus blossom as the “gentleman (junzi) among flowers.”  The junzi, the “noble person,” is the highest ethical ideal in early Confucianism and, essentially, the equivalent of the Sage in Neo-Confucian tradition. What’s more, not only is Zhou speaking of a natural phenomenon — a blossoming lotus flower — in moral terms here, he is also underscoring the deeply aesthetic dimension involved. Like the beautiful lotus, the junzi marks the full flowering of human life.

The intertwining of the ethical and cosmological in Zhou’s thought shows, above all, in his practical focus.  Throughout the “Explanation” and the Tongshu, Zhou speaks of our sagely dimension in dynamic, active terms.  Be it in his admonitions regarding continual striving, his reminders of the importance of ordering society, or his cautious approach to acting in the world, Zhou maintains that the moral life reflects the cosmic order; sagely behavior is in tune with the creative guidance of Heaven and the nurturing vitality of Earth.

In his work, Zhou freely mixes metaphysical and ethical language, switching from one to the other effortlessly, like a sage acting in accordance with the cosmos by establishing a good society following Confucian moral teachings. Thus as he notes in the “Explanation,” “Only humans receive the finest and most spiritually efficacious [qi]. Once formed, they are born; when spirit (shen) is manifested, they have intelligence; when their fivefold natures are stimulated into activity, good and evil are distinguished and the myriad affairs ensue. The sage settles these [affairs] with centrality (zhong) and correctness (zheng), humanity (ren) and rightness (yi). . ..”

One final point that has some bearing on the inseparability of ethics from the workings of the cosmos in Zhou’s work is how it may anticipate some of the views of Wang Yangming (Wang Shouren, 1472-1529), specifically the inseparability of “innate (moral) knowledge” (liangzhi) from action.  From Zhou’s perspective, a sage is rooted in authenticity; as he says in the Tongshu, “being a sage is nothing more than being authentic.”  Moreover, he later states, “Being perfectly authentic, one acts.”  In other words, to be a sage is to act in an authentic (sagely) way.  In a similar vein, Wang explains to his student Xu Ai in Instructions for Practical Living (Chuanxilu), “There have never been people who know but do not act.  Those who are supposed to know but do not act simply do not know yet.”  It seems that both Zhou and Wang would agree with Socrates’ famous dictum that “to know the good is to do the good.”

e. Sageliness as Ideal for Daily Life

The concept of sageliness as an ideal to be actualized in daily life is implicit in the previous point regarding Zhou and Wang Yangming. Even a cursory reading of the “Explanation” and the Tongshu reveals Zhou’s concern for putting sagely ideals into practice.  As Zhou says, “To be active and correct is called the Way.”  In the introduction to A Short History of Chinese Philosophy, Feng Youlan quotes one of his colleagues as saying, “Chinese philosophers were all of them different grades of Socrates. . . With him, philosophy was hardly ever merely a pattern of ideas exhibited for human understanding, but was a system of precepts internal to the conduct of the philosopher” (A Short History of Chinese Philosophy, 10).  This passage reads as if it were written specifically about Zhou himself. Clearly for Zhou, the true goal should be to realize sageliness, that is, to discover it and make it concretely real here and now.

Zhou makes clear that the sage as ideal must be engaged with society and the larger world. Not only does the sage “settle these [affairs],” according to the “Explanation,” but Zhou gives extensive guidance for sagely action in the world throughout the Tongshu.  Perhaps his most succinct discussion comes in chapter 6: “The Way of the sages is nothing more than humanity and rightness, centrality and correctness. Preserve it and it will be valuable. Practice it and it will be beneficial. Enlarge it and it will match Heaven-and-earth.” The sage is actively involved with things, guided by morality rooted in the cosmos.  We should remember, though, that this ideal is also profoundly spiritual, suggesting an “inner worldly mysticism” that embraces all of life.

Understandably, Zhou’s concern for sageliness manifests in the various models he upholds for our emulation.  The most obvious example is Confucius, whom Zhou often quotes and to whom Zhou explicitly devotes two chapters (38 and 39) of the Tongshu.  Zhou also holds up Confucius’ disciple Zilu and the legendary Fuxi, who is credited with writing the hexagrams of the Yijing.  However, Zhou reserves special reverence for Yan Hui, that most spiritual of Confucius’ disciples.  Thus when discussing the comprehensive nature of the sage, Zhou writes, “Master Yan was the one who brought out the Sage’s comprehensiveness and taught ten thousand generations without limit.  Was he not equally profound?”  Interestingly, Zhou himself plays a similar role for later Neo-Confucians, who held him up as a model of authenticity.

4. Principal Concerns

As is the case with all significant thinkers, Zhou Dunyi’s work provides a wealth of material for further analysis.  Some of the concerns that Zhou deals with are of universal philosophical interest, while others are particular to Chinese, or even more specifically Confucian, thought.

a. Lineage

In traditional Chinese culture, wherein family relations lie at the center of social life and identity, lineage is paramount.  This is true not just socially and politically, but in scholarly circles as well; after all, most “schools” of Chinese thought are called jia (“family”).  Indeed, it is a cliché to say that Chinese society is envisioned as a large family with the emperor (“Son of Heaven”) as its father. To be true to one’s jia is crucial; to deviate from its ways or to step outside its bounds is to bring shame upon the larger family, including the ancestors, and risk severe punishment, even ostracism.  To have a disreputable lineage or one that is haphazard or unknown is highly suspect in polite circles.  For scholars, lineal connection to earlier thinkers is a necessity, since that helps certify that one has truly received Dao.  The Way, if it is to continue, must be transmitted to succeeding generations.  The fact, then, that Zhou’s teachings have a questionable lineage was a major concern in later Confucian circles. In the preface to his “Conversations of Master Zhu, Arranged Topically” (Zhuzi yulei) 94:3153, Zhu Xi gets to the heart of the matter when he says, “No one knows where his (Zhou Dunyi’s) teaching tradition came from.”

Most contemporary scholars agree that Zhou’s inclusion in the “orthodox” lineage of Song Neo-Confucianism is due to the efforts of Zhu Xi in the late 12th century.  Almost from the start, Zhu faced conflict from various sources, notably the Lu brothers, Lu Jiushao (1120s-1190s) and Lu Jiuyuan (1139-1193), two literati who argued that Zhou was far too Daoist to be considered a recipient of the Confucian Way.  In addition, there are historical issues with Zhou’s alleged connection to the Cheng brothers, among them the fact that Cheng Yi declares that his older brother Cheng Hao personally rediscovered the Way via his study of the Classics.  What’s more, neither of the Chengs refers to Zhou in terms typically reserved for teachers; instead, they call him by his personal names. Additionally, none of the Chengs’ disciples even mention Zhou in their writings. Taken together, these points call into question Zhou’s place in the direct line of Confucian transmission.

Joseph Adler and others have investigated the historical and biographical records and discovered that during the latter part of Zhu Xi’s life there was a concerted effort on the part of Zhu and Hunan scholar-officials to elevate Zhou to sagely status despite prevailing opinion at the time – an endeavor that culminated during the reign of Emperor Lizong (1225-1264).  For his part, Zhu Xi sidesteps the tenuous historical connection by attributing Zhou’s sagely mind to a transcendent source.  Thus, as Zhu writes in a record of his personal pilgrimage to the place of Zhou’s study:

“As for Master [Zhou] Lianxi, if he did not receive the propagation of this Dao by Heaven, how did he continue it so easily after such a long interruption, and bring it to light so abruptly after such extreme darkness? . . . The Five Planets were in conjunction in Kui [one of the twenty-eight lunar lodges used to structure the ancient Chinese calendar], marking a turning point in culture. Only then did the heterogeneous qi homogenize and the divided [qi] coalesce; a clear and bright endowment was received in its entirety by one man, and the Master [Zhou Dunyi] appeared. Without following a teacher, he silently registered the substance of the Way, constructed the Diagram and attached a text to it, to give an ultimate foundation to the essentials. . . . Ah! Such grandeur! Were it not for what Heaven conferred [on Zhou], how could we participate in this?”

Such appeals to Divine Authority, however, raise philosophical problems too numerous to discuss here.

b. Daoist and Buddhist Influences

Daoist and Buddhist influences on Zhou’s thought also warrant serious attention, particularly in light of the controversies surrounding Zhou’s lineage and Zhu Xi’s rather strained efforts to rope him into the Confucian camp. One common, albeit simplistic, view of Neo-Confucianism is that it began in the Southern Song (1127-1279) in response to widespread political, social, and cultural dislocation after the collapse of the Northern Song (960-1126).  With the loss of Chinese territories, especially the Yellow River Valley, the traditional Chinese “heartland,” to non-Han invaders, various scholar-officials sought to reclaim a distinctly Chinese identity linked to Confucianism. As part of their efforts, they reformed the civil service system by purging it of Daoist and Buddhist elements. In doing so, they also diminished the political and institutional power of both rival “Ways” and articulated a philosophically robust Confucianism that could hold its own against Buddhist and Daoist wisdom.  Ironically, most contemporary scholars agree that Neo-Confucianism owes a great deal to Daoist and Buddhist ideas and practices.

Without doubt, Zhou’s connections to Daoism are deep.  The diagram that Zhou uses in his “Explanation,” for instance, strongly resembles several others used by Daoists, such as the Wujitu (“Wuji Diagram”), which is included in the Daoist Canon (Daozang), and the Xiantian taiji tu (“Taiji Diagram which predates Heaven”).  While there is some debate about the details, the prevalent view is that Zhou received his diagram from Mu Xiu (979-1032), a minor official who himself received it from Chong Fang (956-1015), a former official turned recluse. Chong Fang, in turn, received the diagram from Chen Tuan (d. 989), a famous Daoist master.  Several key terms that Zhou uses – wuji and wuwei, for example – also have Daoist associations, and the priority Zhou gives to “stillness” over “activity” also has a strongly Daoist overtone.

Zhou’s work also shows the marked influence of Buddhism.  For instance, Cheng Yi refers to Zhou as a “poor Chan [Zen] fellow,” and records indicate that Zhou counted several Buddhists among his friends and teachers, notably Shou Ya, a master at the Helin Temple in Jiangsu province.  It is possible that Zhou was even a Buddhist layman (upasaka) for a time.  Some scholars have suggested connections to the work of Guifeng Zongmi (781-841), a patriarch in both the Chan and Huayan schools of Chinese Buddhism.  Zhou’s discussion of the sage as having “no desire” and being “impartial” also resonates with the Buddhist virtue of upeksha (“equanimity”) and the ideal of mahakaruna (“great compassion”).

All told, it is impossible to deny influences, direct and indirect, of Daoism and Buddhism on Zhou Dunyi.  The various issues surrounding such influences on Zhou may not matter much, however, to students of global philosophy.  In fact, they may only be problematic for those who share a more traditional Confucian concern for purity of lineage, or for scholars who approach the study of Chinese (and, indeed, all of East Asian) philosophy and religion with more Western assumptions of exclusivity.  This is not to deny the historical difficulties in pinning down Zhou’s religious and philosophical pedigree or the problems it caused later Confucian thinkers, but only to note that such concerns in no way detract from his philosophical and spiritual insights.

c. Criticism of Other Thinkers

Zhou’s oracular style (characterized by pronouncements), the fact that his writings consist mainly of commentary on the Classics, and his overall religious tone give the impression that he is not a “philosopher” in the modern academic sense. He is not, in other words, a thinker who critically engages with other thinkers, using logical arguments to disprove certain truth claims while establishing others.  When we read carefully, however, we can see a number of implicit criticisms of rival thinkers.

One example is in the Tongshu, chapter 16, where in distinguishing “things” (having physical form) and “spirit” (shen) he observes, “Things, then, are not penetrating. Spirit renders the myriad things subtle.”  This seems to be a counter-assertion to the Huayan Buddhist doctrine of shi shi wu ai (“unobstruction of all phenomena,” that is, the interpenetration of all things).  Shi shi wu ai, according to Neo-Confucians, effectively denies the reality of the actual world.  In the next chapter, which is devoted to music and ritual, Zhou laments the present state of society: “Later generations have neglected ritual. Their governmental measures and laws have been in disorder. Rulers have indulged their material desires without restraint, and consequently the people below them have suffered bitterly.” This reads like standard Confucian boilerplate, but its critical edge is unmistakable. In chapter 24, Zhou states, “The most revered thing in the world is the Way; the most honored is virtue; the most rare [difficult to attain] is the human being.”  While the echoes of Laozi are unmistakable in Zhou’s praise for Dao and de, the fact that he immediately goes on to praise human beings as having a special status strikes a decidedly Confucian tone.  There are other examples of Zhou’s critical stance in the Tongshu as well. For instance, in chapters 28 and 34 Zhou criticizes superficial scholars who, he says, are concerned with elegant literary style rather than striving for sageliness, a common Confucian theme.

These passages remind us that Zhou’s work did not emerge in an intellectual vacuum.  He worked from a perspective deeply informed by certain basic ideas and assumptions that arose within a highly complex and contested philosophical milieu.  Thus, as we can see, Zhou Dunyi takes a strongly critical stance in much of his writings.  Moreover, he offers insightful, albeit oblique, observations that shed light not only on his own context, but that also address ethical, political, and metaphysical issues that crop up in other cultural contexts – one of the hallmarks of any great thinker.

d. Quietism

From a global philosophical perspective, Zhou seems to espouse a form of quietism, in that he emphasizes a more interior, contemplative approach to life rather than acting boldly to shape events through force of will.  Although “quietism,” strictly speaking, refers to a Christian theological position that held sway during the 17th century before being declared heretical by the Vatican, the centrality of attaining a detached, serene state of mind within Zhou’s writings strongly resonates with quietist doctrines.  Such accusations of quietism are related to criticisms about the seemingly undue influence of Daoism and Buddhism on Zhou as well.

The charge of quietism is understandable in light of Zhou’s view of the relationship between stillness and activity.  Stillness and activity co-entail each other, and, in fact, are just another way for Zhou to explain the interaction of yin and yang.  Furthermore, Zhou does give priority to stillness – something about which several later Neo-Confucians expressed concern.  The distinctly religious dimensions of Zhou’s work also make it easy for critics to dismiss him, especially in light of common stereotypes about mysticism as an excuse to withdraw into a timidly pious passive acceptance of things “just as they are.”

Nonetheless, arguments that Zhou espouses a passive quietism are, at best, straw men.  Whatever his mystical inclinations, Zhou seems firmly focused on practical affairs.  He draws heavily on Confucian directives on how to live a good life, and, in the Tongshu, explicitly attends to stereotypically “Confucian” concerns about education, ritual, and the proper governing of society, including the necessity for punishing wrongdoers.  Even more to the point, Zhou provides clear instruction about activity, saying that one should pay attention and take great care when wielding power.  As a thinker imbued with a sense of the Classical Chinese cultural heritage, Zhou repeatedly seeks guidance for engaging with life in authoritative sources, most especially the Yijing but also other Confucian texts such as the Analects, thereby anticipating Zhu Xi’s later comment that studying the classics is like meeting the sages face-to-face.  Furthermore, as we have seen, Zhou holds up examples from Confucian history as models for our own behavior.  While there are aspects of quietism in much of Zhou’s work, overall he does not advocate passive withdrawal, but a wise and attentive way of participating in the world without recklessly forcing it to conform to our selfish desires.

e. The Problem of Evil

Explaining evil, destruction, pain, cruelty, and so forth, has been a perennial problem for philosophers throughout history.  Numerous solutions have been proposed over the centuries, ranging from the Christian doctrine of “original sin,” to the Buddhist and Hindu teaching that we are mired in samsara (literally “wandering through,” the beginningless cycle of birth-and-death) due to fundamental ignorance underlying our incessant cravings and selfishness.  For Chinese thinkers in general, evil is due to departure from Dao, which results in disharmony within the individual, society, and the world. Confucians are divided on some of the particulars here. Mencius, for example, holds that humans are innately good, while Xunzi maintains that people are essentially animalistic. Both agree, however, that human beings can improve through the influence of a proper education and virtuous government.

Zhou by and large assumes a Mencian view of innate goodness, but he never spells it out explicitly.  In the “Explanation” he states, “Only humans receive the finest and most spiritually efficacious [qi].”  This seems to be an allusion to Mencius’ remark about nourishing his “vast, flowing qi” as a crucial component to moral and spiritual cultivation (Mencius 2A2), and certainly this is how Zhu Xi interprets Zhou.  This view of human nature is also confirmed by passages in the Tongshu such as chapter 20, where Zhou affirms that sagehood can be learned by adhering to “the essentials” (being unified, without desire, clear, impartial, and so forth), most of which are associated with the exercise of moral virtue rooted in our Heavenly endowment.

Still, while Zhou clearly speaks of the fundamental goodness of humanity, he barely touches on evil itself.  Of human beings, he says in the “Explanation,” “when their fivefold natures are stimulated into activity, good and evil are distinguished and the myriad affairs ensue.”  Zhou repeats the same idea in the Tongshu, adding only that “In incipience there is good and evil.”  The idea seems to be that good and evil, properly understood, only arise with the start of actual human activity.  Does Zhou mean that one’s inherently good human nature, when coming into contact with external things, can give rise to actual good or evil affairs?  It is unclear, but Zhou’s statements definitely provoked many later commentators.  It is really only after Zhang Zai’s explanation of the role of qi that Neo-Confucians had a way to reconcile the Mencian view of fundamental goodness with the undeniable existence of evil in the world.

5. Legacy

Zhou Dunyi was a major influence on the development of Neo-Confucian metaphysics, while the spiritual dimensions of his work continue to resonate with various thinkers.  Wing-tsit Chan declares that the most accurate estimation of his work can be found in the comments of the later scholar Huang Bojia (fl. 1695), a passage that deserves to be quoted in full:

“Since the time of Confucius and Mencius, Han (206 B.C.E.-220 C.E.) Confucianists merely had textual studies of the Classics. The subtle doctrines of the Way and the nature of man and things have disappeared for a long time. Master Zhou rose like a giant. . . . Although other Neo-Confucianists had opened the way, it was Master Zhou who brought light to the exposition of the subtlety and refinement of the mind, the nature, and moral principles.” (quoted in Chan, A Source Book in Chinese Philosophy, 461; pinyin romanization substituted for Wade-Giles in original).

A. C. Graham, however, argues in his landmark Two Chinese Philosophers: The Metaphysics of the Brothers Ch’eng that Zhou had little direct influence on these seminal thinkers. Certainly, in light of the evidence of Zhu Xi’s creative work in establishing the orthodox “transmission of the Way” (Daotong), we should not consider Zhou to be the historical “founder” of Neo-Confucianism.

Still, while any direct connection between Zhou and later Neo-Confucians is tenuous, his inspirational role cannot be doubted.  One famous story, attributed to Cheng Hao in Reflections on Things at Hand, says that Zhou refused to cut the grass growing outside his window, saying, “[The feeling of the grass] and mine are the same.” While this tale seems the stuff of hagiography, it does give us a sense of the reverence for Zhou within Confucianism.  Indeed, as an affirmation of the fundamental continuity of all life, this story is a poignant example of what living out Zhou’s metaphysical vision might be like.  Such stories have helped cement the image of Zhou as a “latter-day Sage,” an image that fits well with the specific models of Sageliness he holds up (Yan Hui and Confucius, to name two). In this regard, it is noteworthy that in chapter 14 of Reflections on Things at Hand, entitled “On the Dispositions of Sages and Worthies,” Zhu Xi says of Zhou that “[his] mind was free, pure, and unobstructed, like a breeze on a sunny day and the clear moon.”  Elsewhere, Zhu says that Zhou’s mind was “harmonious with the ‘Supreme Polarity’,” and that he “had the joy of Confucius and Yanzi.”

Joseph Adler argues that Zhou’s importance lies in the fact that his work provided a basis for Zhu Xi’s own religious practice. Specifically, Zhou’s teaching on the interrelationship of “stillness” and “activity” enabled Zhu to ground his methods of self-cultivation in the words of an earlier figure revered for his own spiritual example.  Regardless, Zhou Dunyi is a profound thinker whose poetic words still provide philosophical and religious guidance.

6. References and Further Reading

  • Adler, Joseph A. Reconstructing the Confucian Dao: Zhu Xi’s Appropriation of Zhou Dunyi. Albany: State University of New York Press, 2014.
    • The single best scholarly discussion of Zhou’s thought and his place within Neo-Confucianism currently available.  In addition to his insightful analysis of the “Explanation” and the Tongshu, Adler argues that Zhou’s work provided the solution to Zhu Xi’s personal spiritual crisis by supplying a cosmological and metaphysical underpinning for Zhu’s own religious practice.  Includes clear annotated translations of the “Explanation” and the Tongshu along with Zhu Xi’s commentaries, prefaces, and postscripts, as well as passages from the writings (commentaries, prefaces, and so forth) on Zhou’s work from other Neo-Confucian thinkers.
  • Adler, Joseph A. “Response and Responsibility: Chou Tun-I and Neo-Confucian Resources for Environmental Ethics.”  In Confucianism and Ecology: The Interrelation of Heaven, Earth, and Humans, edited by Mary Evelyn Tucker and John Berthrong, 123-49. Cambridge, MA: Harvard University Center for the Study of World Religions, 1998.
    • Excellent discussion of Zhou’s thought highlighting the ecological dimensions of his ethical/spiritual scheme.
  • Adler, Joseph A. “Zhou Dunyi: The Metaphysics and Practice of Sagehood.” In Sources of Chinese Tradition, 2nd ed., vol. 1, edited by Wm. Theodore de Bary and Irene Bloom, 669-78. New York: Columbia University Press, 1999.
    • Good annotated English translations of Zhou’s “Explanation” in its entirety (including the Diagram itself) along with selections from the Tongshu (chapters 1, 3, 4, 16, and 20).  Includes a useful introductory discussion of Zhou’s life and work.
  • Chan, Wing-tsit, ed.  A Source Book in Chinese Philosophy.  Princeton: Princeton University Press, 1963.
    • A must-read for anyone interested in Chinese thought.  Chan’s own perspective is heavily colored by Neo-Confucianism (particularly the Cheng-Zhu line).  Chapter 28 is devoted entirely to Zhou, and includes not only biographical information and philosophical analysis, but annotated English translations of both the “Explanation” and the Tongshu in their entirety.
  • Chan, Wing-tsit, trans. Reflections on Things at Hand: The Neo-Confucian Anthology Compiled by Chu Hsi and Lu Tsu-Ch’ien. New York: Columbia University Press, 1967.
    • Masterful philosophical translation of the primary text of Neo-Confucian thought.  Heavily annotated with a 27-page introduction that includes biographical information about Zhou and the other three “founders” of the Cheng-Zhu line.  Also includes a 25-page glossary of key Chinese terms (Wade-Giles Romanization and traditional characters) and a short (11-page) essay entitled “On Translating Certain Chinese Philosophical Terms.”  Not only does this anthology open with Chan’s translation of the “Explanation,” the index makes it easy to locate all 12 of the passages from Zhou’s writings that Master Zhu included.
  • Fung Yu-lan [Feng Youlan]. A History of Chinese Philosophy. Volume II: The Period of Classical Learning (from the Second Century B.C. to the Twentieth Century A.D.). Translated by Derk Bodde. Princeton: Princeton University Press, 1953.
    • Rather dated but masterful overview of the history of Chinese thought. Like Chan’s Source Book, a must-read for students of Asian philosophy.  Section 1 of Chapter XI focuses on Zhou’s thought.  The condensed A Short History of Chinese Philosophy (a single-volume distillation of Fung’s larger two-volume work) is also informative.
  • Tu Wei-Ming and Mary Evelyn Tucker, eds.  Confucian Spirituality. Volume Two.  New York: The Crossroad Publishing Company, 2004.
    • Part of the “World Spirituality” series, this collection of nearly 20 essays examines Confucian religious thought and practice from the Song era down to the present, covering the spread of Neo-Confucianism to Korea, Japan, and Vietnam and its development into a truly global tradition. Although Zhou Dunyi is not the focus of any specific essay, discussion of his thought and influence figures prominently in several pieces in the first part of the volume.
  • Tu Wei-Ming. Confucian Thought: Selfhood as Creative Transformation. Albany: State University of New York Press, 1985.
    • Classic discussion of the spiritual dimensions of Confucian tradition (particularly the more Mencian Neo-Confucian dimensions) by its foremost proponent.  While not explicitly devoted to Zhou, Tu’s discussion illuminates themes that run throughout the Song master’s work.
  • Wang, Robin. “Zhou Dunyi’s Diagram of the Supreme Ultimate Explained: A Construction of the Confucian Metaphysics.” Journal of the History of Ideas 66/3 (July 2005): 307-323.
    • Highlights the ways that Zhou’s thought traces a notion of gender complementarity in his depiction of human beings as arising from and embodying the original and sustaining energies of the cosmos (yin and yang). Human persons are the highest exemplification of these energies and, as such, a prime phenomenon of this dynamic cosmic creation.
  • Zhou Dunyi.  Zhou Dunyi ji (Collected Works of Zhou Dunyi).  Edited by Chen Keming. Beijing: Zhonghua Shuju, 1990.
    • Good contemporary Chinese edition of Zhou’s primary works.

 

Author Information

John Thompson
Email: john.thompson@cnu.edu
Christopher Newport University
U. S. A.

Charles Hartshorne: Dipolar Theism

From the beginning to the end of his career Charles Hartshorne maintained that the idea that “God is love” was his guiding intuition in philosophy. This “intuition” presupposes both that there is a divine reality and that that reality answers to some positive description of being a loving God. This article focuses on the latter issue, namely, Hartshorne’s concept of God. Hartshorne’s views on the former issue are treated separately in another article, “Charles Hartshorne: Theistic and Anti-Theistic Arguments.” Hartshorne vigorously defended both propositions by clarifying what he meant by the phrase, “God is love,” by defending his views against a variety of objections, and generally by arguing that his version of theism (called “dipolar” or “neoclassical” theism) survives critical scrutiny better than its philosophical competitors.

Heavily influenced by Alfred North Whitehead, Hartshorne borrowed some of Whitehead’s technical vocabulary and he often promoted broadly Whiteheadian ideas. It is a mistake, however, to style him as Whitehead’s disciple, for he departed from the older philosopher on a number of points, most notably (where this article is concerned) on questions surrounding the concept and the existence of God. In what follows, Hartshorne’s ideas about the concept of God are examined. It is important, however, to appreciate that the formulation of a coherent theism is an integral part of the rational defense of theism. Hartshorne spent much of his career in a philosophical atmosphere in which the question was not so much “Does God exist?” as it was “Does ‘God’ name a coherent idea?” Philosophers from very diverse schools of thought—from Sartre to the Logical Positivists—rejected theism on the basis of alleged inconsistencies in the very idea of deity. Hartshorne himself remarked that there would be fewer atheists if theists had done a better job of making sense of the concept of God. Hartshorne’s response to this situation was to develop his dipolar or neoclassical concept of God. It can plausibly be claimed that Hartshorne accomplished at least two tasks: first, he introduced a sophisticated and religiously important form of theism heretofore unheard of, or at least very poorly developed through philosophical argument, and, second, he shifted the burden of proof onto those who claim that the concept of God is hopelessly muddled.

Table of Contents

  1. Divine Love and Divine Relativity
  2. Existence and Actuality
  3. Divine Perfection
  4. Divine Power
  5. Divine Knowledge
  6. Panentheism
  7. Conclusion
  8. Suggestions for Further Reading
    1. Primary Sources
      1. Books (in order of appearance)
      2. Hartshorne’s Response to his Critics
      3. Selected Articles
    2. Secondary Sources
    3. Bibliography

1. Divine Love and Divine Relativity

The only deity worthy of worship, Hartshorne believed, is one that could be described as “Love divine, all loves excelling,” as in the title of Charles Wesley’s hymn. Hartshorne did not identify himself as a Christian, nor did he consider himself a theologian. He argued, however, that Christian thinkers had an unfortunate tendency to allow what he considered to be warped ideas about absolute power and unchanging perfection to eclipse the central teaching of their faith concerning divine love. The parables of Jesus and the personal qualities he exhibits in the Gospels reflect, for the Christian, the image of a loving God. They portray one who not only acts for the benefit of the beloved but also sympathizes with others in such a way as to rejoice in their well-being and feel sorrow in their tragedies. These are the qualities of love that Hartshorne takes to be essential to it; at a bare minimum, love requires the capacity both to act for the welfare of others and to sympathize with their feelings. As the etymology of “compassion” suggests, it is “to suffer with” another in the desire to ameliorate the other’s suffering. If this sort of love is to be attributed to the divine being, then it must be possible for God not only to act for the welfare of the creatures but also to be affected by their weal and woe. In short, divine love entails the divine relativity: a social conception of God—the title of Hartshorne’s fourth book, published in 1948, now considered a classic in the philosophy of religion.

Divine relativity is precisely what much of traditional theology would not allow. As Aquinas said in Summa Theologica, the creatures are really related to God, but God has only a rational (that is, an imagined) relation to the creatures (ST I, Q 13, a. 7). In short, God is impassible, or unaffected by anything external. The only doctrine of divine love consistent with the doctrine of impassibility is one in which God promotes the welfare of the creatures but is unaffected by what happens to them. On this view, divine love, unlike human forms of love, involves neither sympathy nor empathy. John Sanders demonstrates in The God Who Risks that Christian thinkers, from as early as Justin Martyr, realized that there is a tension between the belief in the goodness of God and the denial that God somehow shares in the joys and sorrows of the creatures. Anselm raised the question explicitly in chapter 8 of Proslogion: How can God be all-loving without any sympathetic responsiveness? Anselm answered by promoting a kind of theological behaviorism: we feel the effects of God’s goodness, but God feels nothing. On Hartshorne’s view, this does not answer the question; it only reasserts divine impassibility.

Hartshorne affirms God’s love as involving both benevolence and feeling. Because God loves the creatures, what happens to them is felt also by God. As a loving parent suffers for a child who is ill or who has lost her way in life, so the God in whom Hartshorne believes suffers through the misfortunes and the mischief of the creatures. He was fond of quoting one of the final statements from Whitehead’s Process and Reality that “God is the great companion—the fellow-sufferer who understands.” Hartshorne, following both Whitehead and Berdyaev, maintained that there can be tragedy, even for God. As Martha Nussbaum argues, tragedy can happen only to someone who cares enough about others to be disappointed by them or hurt by what happens to them. God, in Hartshorne’s view, is one who cares and who can therefore be disappointed or hurt by the actions of the creatures.

Hartshorne’s basic argument for divine relativity is stated throughout his writings. If God knows contingent states of affairs (for example, a woman listening to a bird singing at a particular time and place), then there must be contingency in God. For, if the object of knowledge can be other than it is (for example, the woman not listening to the bird), then the knowledge itself could be otherwise (for example, God knowing that the woman is not listening to the bird). The argument is not that God might have failed to be omniscient, but that the particular cognitive states of God could have been different. As Hartshorne noted, Aristotle inferred from this reasoning that God does not know the world; Spinoza, on the other hand, denied the contingency of the world—despite what seems to be the case, it is impossible, at that very moment, that the woman not be listening to the bird. Hartshorne concludes that one must choose among the mutually exclusive options: a God that is ignorant of the world, a world devoid of contingency, or the neoclassical view that there is contingency in God. What is ruled out by this argument is the Thomistic view that God knows contingent states of affairs but there is no contingency in God. (For different formalizations of this argument see Shields 1983 and Viney 2007/2012.)
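The structure of this argument can be made explicit. What follows is only a minimal sketch in standard modal notation, offered as an illustration rather than as one of the published formalizations cited above (Shields 1983; Viney 2007/2012); read Kp as “God knows that p”:

$$
\begin{aligned}
&1.\ \Box(Kp \leftrightarrow p) &&\text{(necessarily, God knows } p \text{ just in case } p \text{ is true)}\\
&2.\ \Diamond\neg p &&\text{(} p \text{ is contingent: it could have been false)}\\
&3.\ \Box(\neg p \rightarrow \neg Kp) &&\text{(from 1)}\\
&4.\ \Diamond\neg Kp &&\text{(from 2 and 3)}
\end{aligned}
$$

If p is in fact true, then by (1) God knows that p, and by (4) that very cognitive state could have been absent; hence there is contingency in God. On this sketch, Aristotle’s option amounts to rejecting (1) for worldly truths, Spinoza’s to rejecting (2), and the neoclassical option to accepting (4), while the Thomistic position, which accepts both premises yet denies the conclusion, is what the argument rules out.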

Hartshorne’s basic argument for divine relativity is expressed in terms of the idea of God’s exhaustive knowledge, but it could equally well be rephrased in terms of inexhaustible love, for love, like knowledge, has its objects. Of course, these are not the only qualities that theists usually ascribe to God—there are also such qualities as eminent creativity, perfect power, and infinite wisdom. Hartshorne attempts to do justice to these ideas in formulating his neoclassical concept of God, but for him divine love remained paramount. This is significant, for it highlights Hartshorne’s commitment to the principle that negation is parasitic upon positive attributions, that there are no merely negative facts (see “Charles Hartshorne: Neoclassical Metaphysics”). Many theologians, eager to affirm the transcendence of God, emphasize what cannot be known of God and argue that, in view of this ignorance, the most appropriate theological language is by way of negation (via negativa): God is not finite (infinite), not changeable (immutable), not affected by anything external (impassible), not contingent (necessary), not in time (non-temporal), and so forth. Hartshorne also emphasized what is not known of God and he did not deny that negations play an important role in religious discourse. In A Natural Theology for Our Time, he comments that our knowledge of the concrete divine reality is “negligibly small.” He argues, however, that as the sole or even primary approach to religious language, “the negative way” is a case of false modesty. Negative theologians are supposedly being deferential to God by stressing what cannot be known or said of God, but this masks the fact that they consider themselves privy to enough knowledge about the divine reality to know what cannot be attributed to it.

Hartshorne couples the accusation of false modesty with the charge that the negations used of deity by negative theologians almost invariably presuppose invidious contrasts: the finite is inferior to the infinite, the changeable to the unchangeable, the passible to the impassible, the temporal to the non-temporal, and so forth. Hartshorne argues that it is much too simplistic to label one side of an ultimate contrast as “better” and the other side “worse.” On the contrary, there are better and worse forms of each side of each contrast. For example, there are better and worse ways of being affected by others (passibility) and better and worse ways of being unaffected by others (impassibility): to identify too much with the suffering of others is damaging to one’s own well-being and may prevent one from helping others in need; to remain unaffected by the plight of others exhibits the character flaw of insensitivity. In Hartshorne’s view, theologians should not chase after negations if they wish to speak of one that is worthy of worship; rather, they should explore ways of attributing to God what is best in both sides of any particular contrast. For this reason, Hartshorne maintains that, to the extent that language is adequate to theological purposes, only a properly dipolar concept of deity can reflect the divine perfection: God is both finite and infinite, both passible and impassible, and so forth, but in different respects and in eminent ways.

In Analytic Theism, Hartshorne, and the Concept of God, Daniel Dombrowski notes that Hartshorne sought a theory of religious language that avoids two extremes: (1) that language is wholly inadequate to describe God and (2) that verbal formulae may capture God without doubt or obscurity. Hartshorne considered the formal abstractions of metaphysics to be the most nearly univocal language that is possible for deity, for they do not admit of degrees. For example, on Hartshorne’s view, God is, in different respects, necessary and contingent; we shall see, however, that this does not mean that God is more or less necessary or more or less contingent. Hartshorne calls the most nearly equivocal language about God “symbolic” because it presupposes particular times, places, and situations. Metaphors such as “shepherd,” “mother,” and “father” are examples. Analogical language holds a place between the abstract contraries of metaphysics and the concrete imagery of poetic imagination. Analogical language is a matter of degree, as when one says that love comes in many forms, but the eminent form of love belongs to God. In Beyond Humanism, Hartshorne claimed that psychical predicates such as memory, feeling, and volition admit of an infinite variability, extending beyond their specifically human forms to include the non-human animal world and to include what might exist in a superhuman form, such as deity. Hartshorne sometimes says that these sorts of predicates apply literally only to God and not to the creatures. As Dombrowski avers, the most parsimonious interpretation of this “negative anthropology” is that Hartshorne is emphasizing that God alone has the supreme or eminent form of these qualities.

2. Existence and Actuality

To say that God exhibits both sides of a metaphysical contrast would be a logical contradiction unless there were a way of showing that the polar extremes apply to God in different respects. Søren Kierkegaard seemed to relish the paradox that “the eternal came to be in time.” Hartshorne did not mention Kierkegaard in this connection, but he apparently saw little advantage in this way of speaking. In The Divine Relativity, he complained that a theological paradox seems to be what a contradiction is when applied to God. In Hartshorne’s view, asserting contradictory things of God is not a sign of profundity but of confusion. Hartshorne’s proposal is to make a three-fold distinction of logical type, applicable to both God and the creatures, among existence (that a thing is), essence (what a thing is), and actuality (the particular state in which a thing is). To illustrate how this distinction can be applied to both God and the creatures, consider the case of a woman listening to a bird sing and of God knowing this fact. The woman exists, has the cognitive capacity to hear songbirds, which is part of her essence (insofar as audition is part of her natural endowment), and she is currently listening to a bird sing, which is her actual state. The same distinctions apply to God: God exists, has the essence of being all-knowing, and is in the actual state of knowing that the woman is listening to the bird sing.

The tripartite distinction of existence, essence, and actuality is one of logical type analogous to the logical type difference between universals and particulars. One may, for example, deduce that the woman exists if she is listening to the bird, but one may not deduce from the fact of her existence that she is listening to a bird. For this reason, Hartshorne maintains that existence (also essence) is abstract relative to actuality. Actuality is, so to speak, information rich, relative to existence (and essence). This is recognized in modern logic in the use of the existential quantifier which, by itself, gives no details about the existent object. Hartshorne’s three-fold distinction also allows one to make a distinction within God between what is necessary (could not be otherwise) and what is contingent (could be otherwise). It is conceivable that God exists necessarily and necessarily has the quality of being all-knowing, but the actual state of God’s knowing (for example, knowing that the woman is listening to a bird sing) might be contingent. Barring determinism, the woman’s listening to the bird is contingent: she might have been asleep, she might have been listening to a different bird, she might have been distracted, and so forth. If God is necessarily all-knowing, then God knows about the woman and her actual state, regardless of what it may be. Moreover, God’s actual state of knowing the woman as listening to the bird sing is as contingent as the fact that she is listening to the bird sing. The following diagram summarizes how the distinctions between the concrete and the abstract and the necessary and the contingent map onto Hartshorne’s three-fold distinction of existence, essence, and actuality as it applies to God and the creatures.

[Diagram: Hartshorne’s three-fold distinction of existence, essence, and actuality mapped onto the abstract/concrete and necessary/contingent distinctions, for both God and the creatures.]

The three-fold distinction is often referred to by means of the simpler distinction between existence and actuality, thereby anticipating the thesis of Hartshorne’s ontological argument that existence belongs to the nature or essence of God. One need not accept the ontological argument, however, to appreciate the importance of the distinction. David Tracy calls the distinction “Hartshorne’s Discovery” and Hartshorne himself said, “I rather hope to be remembered for this distinction.” Hartshorne notes that Aristotle anticipated the tripartite distinction of existence, essence, and actuality when he spoke of substance, essence, and accident. Hartshorne’s criticism of the Stagirite is that he considered substance to be ontologically basic and thus could speak of accidental compounds. For Hartshorne, actuality is ontologically basic in the sense of being most concrete. In Philosophers Speak of God, Hartshorne writes, “It is actuality of accidents, not existence of substances that is prior” (1953, 72).

The distinction between existence and actuality is important because it allows, among other things, that there can be give-and-take relations between God and the creatures without reducing God to the status of a creature. Contrary to the ancient tradition of divine impassibility, God can be conceived as affected by the creatures. In the example, the woman listening to the bird brings it about that God knows that she is listening to the bird, although she does not bring it about that God is omniscient, for God would have been omniscient even had she never existed. In Summa Contra Gentiles, Aquinas argued that any contingency in God implies the possibility of God’s non-existence, thereby reducing God’s existence to the status of creaturely existence (SCG I, 16.2). In view of the difference between existence and actuality, the inference is invalid. God’s actual states can be contingent while God’s existence and essence remain necessary. Moreover, the essence of God must now be described not merely as necessary but as necessarily somehow actualized.

3. Divine Perfection

Hartshorne’s three-fold distinction allows one to appreciate the extent of his divergence from the dominant tradition in philosophical theology which he called “classical theism.” This article has noted that classical theists, committed to the transcendence of God, were keen on the via negativa: God was placed on one side only of the pairs of contrasts, absolute/relative, infinite/finite, immutable/mutable, impassible/passible, necessary/contingent, and eternal/temporal. Hartshorne rejects this as a “monopolar prejudice,” an expression that highlights not only the “monopolar” aspect of classical theism but also the invidious character of the contrasts—the “prejudice”—as applied to God and the creatures. Hartshorne speaks instead of God’s dual transcendence. God transcends the creatures by being the supreme instance of both sides of the contrasts. The distinction between existence and actuality permits a logically coherent doctrine of dual transcendence by distinguishing different aspects of God. For example, God is immutable with respect to existence and essence, but mutable with respect to actuality. That is to say, God’s existence and essence are always the same, but God’s actual states are constantly being added to with the creative advance of the world. Or again, God is both necessary and contingent, but in different respects. God’s existence and essence are necessary (that is, could not be otherwise) whereas God’s actuality is contingent (that is, could be otherwise). The examples of divine mutability and contingency represent God’s flexibility in being able to respond to every possible change. It should now be clear why Hartshorne was making a serious point when he quipped that he believed in twice as much transcendence as was usually found in more traditional forms of theism.

From time to time, Hartshorne has been characterized as promoting a merely finite deity such as one finds in Mill’s essay Theism. Hartshorne’s commitment to the principle of dual transcendence entails that this is mistaken. Insofar as God has actual states, God is indeed finite. Furthermore, God can be nothing other than finite in this respect. God’s actuality is the realization of concrete value in the life of God and every realization of value, whether in God or in any other being, is finite in the sense that it excludes values that could have been achieved. For example, from an early age, Mozart’s father set his son on the trajectory of being a musician. Apart from this education and training, Mozart might have lived a very different life, as a lawyer, a military leader, or a peasant farmer. Each path would have led to a certain value achievement, but each, to a greater or lesser extent, excludes the others. In some fashion, God incorporates Mozart’s achievement into the divine life; as the values Mozart did not achieve were not part of his life, no more are they part of God’s. To say that God is not finite in this sense is to risk accepting a doctrine according to which God is merely infinite—that is to say, that God excludes whatever is of worth in the enjoyment of a finite realization of value. Hartshorne long maintained that the concept of the realization of all possible values is a meaningless ideal. God must, therefore, be finite, but not merely so. Dual transcendence means, among other things, that God must be infinite in receptive capacity; whatever comes to be, comes to be for God and becomes an everlasting component in God’s memory. There must also be in God an inexhaustible or infinite capacity to appreciate the creative advance. In addition, Hartshorne allowed that God is actually infinite in the sense that there was never a time when God did not exist and that God is omniscient with respect to this past life. Hartshorne was quick to add that this form of infinity is not the realization of all possible values, for the actually infinite life of God could have been different in as many ways (an infinite number) as the creative advance itself could have been different.

Classical theologians adopted an ideal of perfection as unchanging, often using the argument from Plato’s Republic that change for the better or worse implies an unchanging measure of perfection. The argument is that if something changes for the better then it is not yet perfect, but if it changes for the worse then it is no longer perfect. In either case, change implies imperfection. God, being perfect, must be devoid of change. This argument, however, begs the question against a dipolar conception of God like Hartshorne’s by assuming that there cannot be perfect forms of change. Hartshorne argues, on the contrary, that some forms of value—aesthetic qualities in particular—do not admit of a maximum. Just as it is impossible to speak of a greatest possible positive integer, so it may be impossible to speak of a greatest possible beauty. The fact that Mozart’s music achieved a new level of beauty does not mean that there was nothing left for Beethoven to do. Another analogy is interpersonal relationships. It is a good thing to be flexible in one’s responses to others. The ideal is not unchangeableness; it is, rather, adequate response to the needs of others. It is true that stability and reliability of character are desirable. But this means, in part, that the person can be relied upon to respond in ways appropriate to each situation, and responsiveness is a kind of change. The analogy is particularly appropriate in the divine case since there are always new creatures to which God must respond and hence there is no upper limit to the values associated with these relationships, for each is as unique as the individuals with whom God is related.

As Hartshorne distinguished existence and actuality, so he distinguished different ways in which God is perfect. Taking a clue from the work of Gustav Fechner, Hartshorne noticed an ambiguity in the concept of perfection. If one is perfect, then one is unsurpassable, but by what or by whom is one unsurpassable? The obvious answer is “by others.” This leaves open the possibility that one may surpass oneself. Thus, there is a distinction between (a) being unsurpassable by all others including self and (b) being unsurpassable by all others excluding self. In Man’s Vision of God and the Logic of Theism, Hartshorne labels these two ideas respectively A-perfection (for absolute perfection) and R-perfection (for relative perfection). God is A-perfect with respect to existence and essence and R-perfect with respect to actuality. Hartshorne agrees with more traditional theists who spoke of God as infinite, immutable, impassible, necessary, and eternal, for this is God’s A-perfection. Hartshorne quickly adds, however, that God is not in all respects infinite, immutable, impassible, necessary, and eternal. To use our previous example, if aesthetic values exhibit an unlimited possibility of increase, then God’s appreciation of beauty may—indeed must—exhibit this possibility. Again, Beethoven’s music introduces new forms of beauty that did not exist prior to his creative life. Hartshorne would also say that God, in enjoying the changing beauty of the world, is also the supremely beautiful object of contemplation, a point that is returned to in the discussion of panentheism. Hartshorne summarized these ideas about divine perfection in The Divine Relativity when he spoke of God as “the self-surpassing surpasser of all.”

4. Divine Power

Theologians have often commented on how difficult it is to define “omnipotence.” Most of those who have thought about this, Hartshorne included, conclude that René Descartes was wrong, in his letter to Mersenne (May 27, 1630), to suppose that God could bring about logically inconsistent states of affairs. Aquinas, for example, in Summa Contra Gentiles denied that God could draw a circle with unequal radii, for this involves a logical inconsistency: one must fix the angle of the compass in order to guarantee that the arc becomes a circle, but one must at the same time not fix the angle, allowing it to become wider or smaller, in order to make the radii unequal (SCG II, 25.14). Aquinas also denied that God could change the past once it has occurred. In Summa Theologica, Aquinas says that not even God can restore virginity to someone who has lost it (ST I, Q 25, a. 4, reply to Obj. 3). Finally, Aquinas denied that God can do what is contrary to God’s nature, such as doing an unloving deed (ST I, Q 25, a. 3, reply to Obj. 2). On each of these points, Hartshorne agrees.

Beyond these agreements, Hartshorne attributes both more power and less power to God than did the Angelic Doctor. For Aquinas, God can act but not be acted upon by anything external—this is the doctrine of impassibility. As seen, Hartshorne argues that God has the power to be acted upon by the creatures and to respond to them. In this sense, Hartshorne attributes more power to God than does Aquinas. On the other hand, Aquinas apparently believed that God can unilaterally bring about some states of affairs in which more than one agent makes decisions. For Aquinas, God is called omnipotent because everything that does not imply a contradiction in terms is within God’s power to accomplish (ST I, Q 25, a. 3). Hartshorne rejects this claim and holds instead that any state of affairs in which more than one agent makes decisions cannot be conceived as the product of one agent, even if that agent is God. Suppose Ruth loves Naomi and Naomi loves Ruth—their mutual love can be explained only by referring to the activity of two persons, Ruth and Naomi. The logic of the situation does not change if one of the agents is God. The state of affairs described by God loving Ruth and Ruth loving God can only be explained by the activity of both God and Ruth, and not by God alone. Of course, if God is all-loving, then it is impossible that Ruth (an actual person) not be loved by God; but this does not change the fact that two agents—God and Ruth—are required to create the situation of their mutual love. If this is correct, then it is false that God, acting alone, can bring about any state of affairs in which more than one agent is making decisions. A corollary is that it is false that God can bring about any state of affairs the description of which is logically consistent—for there is nothing logically inconsistent about two individuals loving each other.

Classical theists, Aquinas in particular, are not without responses to Hartshorne’s reasoning. Aquinas made two claims relevant to Hartshorne’s argument. First, he maintained that the self-same result could be wholly attributed to two different causes; perhaps Ruth’s loving God can be wholly attributed to Ruth and wholly attributed to God. In Summa Contra Gentiles, Aquinas’ example is that the music of a flute is wholly attributable to the instrument and to the musician (SCG III, Pt. 1, 70.8). Of course, the music is manifestly not attributable to either the instrument or the musician singly; both are required, which supports Hartshorne’s claim. It is relevant to note that it is illicit to distribute “wholly” through a conjunction. There is no valid inference from “X is wholly the result of (A and B)” to “X is wholly the result of A and X is wholly the result of B.” The second thing Aquinas says that might undermine Hartshorne’s argument is his claim that God has the power to bring about some events necessarily and to bring about other events contingently (ST I, Q 19, a. 8). In this way, one might make headway in making sense of the idea that God creates a person’s decision while yet preserving the contingency (an element of freedom) of the decision. Again, however, Hartshorne demurs. It makes sense to say that one can be the cause of a contingent event—every roll of the dice is proof of that. It is much less clear that it makes sense to say that one can guarantee the outcome of a contingent event. If one loads the dice in such a way that a particular number must appear (say, seven), then the outcome is not contingent; only if the dice are not loaded is the outcome truly contingent. Again, one should take note of an illicit distribution, but this time it is the problem of distributing “causes” or “guarantees” over a disjunction. There is no valid inference from “X causes (A or B or C)” to “X causes A or X causes B or X causes C.”
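
The two illicit distributions on which this reply turns can be displayed schematically. The notation is not Hartshorne’s own but only a compact gloss on the inference forms just described, with Wholly(X, Y) read as “X is wholly the result of Y” and Causes(X, Y) read as “X causes (or guarantees) Y”:

\[
\begin{aligned}
&\mathrm{Wholly}(X,\ A \wedge B) \;\not\Rightarrow\; \mathrm{Wholly}(X, A) \wedge \mathrm{Wholly}(X, B)\\[2pt]
&\mathrm{Causes}(X,\ A \vee B \vee C) \;\not\Rightarrow\; \mathrm{Causes}(X, A) \vee \mathrm{Causes}(X, B) \vee \mathrm{Causes}(X, C)
\end{aligned}
\]

The flute case fits the first pattern: the music is wholly the result of instrument-and-musician together, not of either singly. A fair roll of the dice fits the second: the roll guarantees that some number or other will appear without guaranteeing any particular number.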

Hartshorne’s most controversial departure from classical theism is his denial of creation ex nihilo. Indeed, the argument just given that some states of affairs require multiple decision makers is itself an argument against ex nihilo creation, at least in its classic form. God was said to create the universe, which includes the decisions that creatures make, in one non-temporal and unilateral act. Hartshorne’s argument entails that no universe with multiple decision makers can be created in its entirety by God alone. Aquinas notwithstanding, the making of decisions is a paradigm of creative activity, for something is brought into existence, even if only the decision itself. For this reason, Hartshorne’s example of multiple decision makers is also an example of multiple creators. Hartshorne saw in Jules Lequyer’s statement that “God created me creator of myself” an anticipation of his own views on divine creativity. A hallmark of Hartshorne’s neoclassical theism is that the universe is a joint creative product of (a) the lesser creators that are the creatures, localized in space and time, and (b) the eminent creator, which is God, whose influence extends to every creature that ever has existed or ever will exist.

Hartshorne defends a metaphysical view that posits creativity as a transcendental, applicable to both God and the creatures. Creativity, in such a metaphysic, is never “from nothing” but is relational, requiring a pre-existent universe (see “Charles Hartshorne: Neoclassical Metaphysics”). It follows that there can be no such thing as God without a universe or, for that matter, a universe without God. A common objection to this view is that it portrays God as dependent upon the universe. Hartshorne considers the objection to be flawed in two ways. First, it assumes an invidious contrast between independence and dependence. As noted, Hartshorne is at pains to instruct philosophers and theologians to be wary of devaluing dependence (and, more generally, to be cautious of simplistic valuations of metaphysical contrasts). Second, the objection is subtly ambiguous. If Hartshorne is correct, then God and the universe are indeed necessary to each other. The proviso, however, is that no particular set of creatures (that is, no particular universe) is necessary to God. An analogy that Hartshorne uses in Creative Synthesis is of a mathematical set that necessarily has numbers, but the numbers that it has are not necessary. God’s actual states, being contingent, are dependent upon interaction with the creatures; God’s existence, on the other hand, is necessary, for it depends upon no particular creatures or groups of creatures. It should also be noted that Hartshorne preserves the distinction between God and the creatures: the divine being meets with no universe that it did not have a hand in co-creating, whereas the creatures, because they begin to exist, are born into a universe that they had no part in making. Of course, once the creature exists, it becomes a lesser co-creator with God.

In The Divine Relativity and elsewhere, Hartshorne distinguishes two forms of power involving direct and indirect causation. Direct causal influence occurs when one entity—Hartshorne’s name for the metaphysically basic entities is “dynamic singulars”—acts on another without an intermediary, as when a present experience acts upon an immediately subsequent experience in the life of a single individual; one’s memory of the preceding moment, for example, is the feeling of one experience acting on its successor in direct fashion. Hartshorne avers that a similar direct action occurs between parts of the nervous system and between the nervous system and the body. Indirect causal influence, on Hartshorne’s account, occurs when one body acts upon another body, which often involves modifying the inter-bodily environment in some way, such as speaking, which sets the air in motion so that sound waves are heard by another person. Some cases of indirect causal action are examples of “brute force” whereby one body moves another body from one place to another. Barring telepathy, cases of one person acting on another are always indirect. On the other hand, Hartshorne maintains that God’s action on dynamic singulars is never indirect. Because each entity retains its own power of creative experiencing, this direct causal influence is not deterministic. Hartshorne, following Whitehead (who was following the later Plato), refers to this mode of influence as “divine persuasion,” which is, in effect, the active side of divine love. God acts as a supreme ideal, urging each dynamic singular to achieve an intensity of experience appropriate to its level of complexity. Thus, in Creative Experiencing Hartshorne says, “It is the [divine] love that explains the [divine] power, not vice versa.”

Some philosophers accept Hartshorne’s critique of the traditional concept of omnipotence but argue that the neoclassical account of divine power does not endow God with the highest degree of power conceivable. One may concede that “divine persuasion” is the most admirable form of power, but insist nevertheless that God should also be conceived as having the ability to thwart human decisions by preventing them from being acted upon or by preventing their natural consequences from occurring. In Divine Power in Process Theism, David Basinger notes that a parent can force an unruly child to go to bed by physically putting the child there. If God is unable to accomplish such a feat then, Basinger argues, God does not have the highest degree of power, for the parent is able to do what God cannot. In response one may note that Hartshorne’s metaphysical principles allow that God has the ability to persuade the child to get into bed or even to persuade the parent to force the child into bed. It is contrary to Hartshorne’s thinking, however, to say that God has a body with a location within the cosmos. This is also contrary to classical theism (also Basinger’s “free will theism”)—the idea that Jesus was God embodied involves metaphysical issues which Basinger’s critique does not presuppose. In view of these qualifications, Basinger’s objection seems to be that if God is to be conceived as having the highest degree of power, God must be able to accomplish miraculously what the parent accomplishes without a miracle through the use of his or her body.

Hartshorne responded to Basinger’s critique in a letter (dated August 4, 1988) and said, among other things, that he doubted that he ever claimed that miracles never occur. He was disinclined to believe that miracles have in fact occurred on grounds similar to those offered by Hume (also Montaigne): probabilities favor deceit or error over genuine miracles. Hartshorne attributed the laws of nature to God’s influence over all dynamic singulars (see the article, “Charles Hartshorne on Theistic and Anti-theistic Arguments: Global Argument”) and said that he doubted our wisdom to judge how far the value of such laws “justifies the absence of notable divine intervention.” Doubting, however, the quality of evidence for miracles is different from doubting the possibility of miracles. Basinger replied to Hartshorne (August 24, 1988) that he wasn’t “quite sure” what it could mean in neoclassical metaphysics to suppose that miracles could occur. This is a fair question, especially in light of Hartshorne’s denial that God acts indirectly. On the other hand, it is fair to ask for an account of divine power that is not merely ad hoc but flows naturally from general metaphysical principles such as Hartshorne was at pains to give. With the possible exception of Descartes’ concept of omnipotence, every account of divine power includes propositions of the form “God cannot X.” The force of the “cannot” may be in the logical impossibility of the act named (for example, making a circle with unequal radii), in the nature of God (for example, God cannot intend evil), in the nature of that over which divine power is exercised (for example, God cannot create a creature’s creative act), or in the particular relations that God has with the creatures (for example, God cannot act indirectly). It is a legitimate question what it means to speak of attributing the highest degree of power to God apart from a system of metaphysical principles. It is not that a particular metaphysic is a final court of appeal for a concept of divine power; on the other hand, an appeal to divine power may be no more than a deus ex machina apart from a well-articulated metaphysic.

5. Divine Knowledge

One of the lessons to be learned from debates about divine power is that one’s ideas about God have implications for one’s ideas about the world and vice versa. To assume that God can bring about any logically possible state of affairs presupposes that all states of affairs are such that, in principle, they require only a single being to bring them about. That presupposition, however, begs the question against a world-view like Hartshorne’s in which reality has a social structure. In such a world, it is no limit on God if God cannot bring about every logically possible state of affairs. There is an analogous lesson where divine knowledge is concerned. If reality is continually in-the-making, as Hartshorne maintains, then there is a fundamental asymmetry between past and future. The past is fully determinate and the future is the realm of the partially indeterminate. If God is all-knowing, then God must know the future for what it is, as partially indeterminate. If one raises the objection that such a deity is not omniscient because the future is partially hidden from it, one has failed to cross the pons asinorum of the debate. It is a defect in divine knowledge not to know a fully determinate future only if there is a fully determinate future to be known. The assumption of a fully determinate future is evident in the use that Aquinas makes of the analogy made famous by Boethius: as each point on the circumference of a circle is equidistant from the center, so God is equally knowledgeable of every moment of time (SCG I, 66.7; see also Boethius, Consolation of Philosophy, Bk 4, Pr. 6). As Hartshorne noted, however, the analogy assumes that time can be represented as a completed whole, whereas time may be more like an endless line whose points are added from moment to moment.

Hartshorne’s criticism of the circle analogy was anticipated by earlier philosophers such as John Duns Scotus (Ordinatio I, d. 39, q. 1-5) and Luis de Molina (Concordia IV, d. 49.18). The questions raised by the circle analogy concern not only the nature of time, but also the nature of God. Traditional theists were reluctant to attribute any passive potency to God; they thought that the perfection of the divine being required that God be immutable and impassible. If, however, God is not affected by anything external, then how is it that God knows the world? Aquinas answered that the cognitive relation in God is the reverse of what it is for humans. We know the world because it affects us, but God knows the world because God is its creator. The Thomistic solution may preserve divine impassibility but at the expense of making human freedom problematic. This problem was discussed in the previous section. There was, however, another very imaginative solution to the “mechanics of omniscience” given by Molina. He argued that, prior to creating the world, God has knowledge of what any possible free creature would do in any particular circumstance. Using this “middle knowledge” in combination with the knowledge of what creatures God has in fact chosen to create, God is able to know what every free creature will do in the circumstances in which they have been placed.

New life was breathed into Molinism by analytic philosophers of religion in the late twentieth century. For his part, Hartshorne never directly addressed Molina’s theory. It is easy enough, however, to reconstruct a Hartshornean response to Molinism. Above all, it is important to appreciate that, of necessity, the logical subjects of God’s middle knowledge are possible persons. God’s knowledge of what would be the case for any free creature is pre-volitional; that is to say, God knows, prior to creating, what any creature, whether it is eventually created or not, would do under any given circumstance. Middle knowledge cannot serve to guide God’s providential decisions about which world to create if it depends upon which world God creates. For this reason, the usual characterization of middle knowledge as “counterfactuals of freedom” is seriously misleading. Prior to God’s decision to create a world, there are no creatures and, hence, no fact of the matter about any actual creature. There are only possible creatures. Hartshorne denied the existence of possible non-actual individuals. In Man’s Vision of God and the Logic of Theism he wrote, “There is an unutilized possibility of individuals, but not an individuality of unutilized possibility.” (See also “Charles Hartshorne: Neoclassical Metaphysics.”) Given these views, it is clear that Hartshorne would reject Molinism.

There is a hint of irony in claiming to know what Hartshorne would say about middle knowledge. Does this not presuppose a kind of middle knowledge of Hartshorne? In view of what was just said about the logical subjects of middle knowledge, the answer to this question should be obvious. Hartshorne was not a possible person; he was a real person whose views on various philosophical topics were clearly stated. The argument is this: Molinism entails belief in possible persons; Hartshorne denied the existence of possible persons; therefore, Hartshorne would deny Molinism. This argument points to one of the most puzzling features of Molinism, to wit, that middle knowledge is not grounded in fact. Hartshorne’s developmental and cumulative view of process permits speculation about what a given actual person would or might do under various sets of circumstances. These “would be” and “might be” statements are grounded in the world-historical process itself, including a person’s character as so far formed or (as in Hartshorne’s case) as it was formed. Hartshorne made precisely this point in his response to Robert Kane in the Library of Living Philosophers volume devoted to Hartshorne’s work. For Hartshorne, God’s knowledge of the world is similar to our knowledge in that it requires a real relation from the object of knowledge to the knower. The difference, in God’s case, is that divine knowledge is eminent—God perfectly knows the extent to which the future is open or closed at any particular juncture of the creative advance.

A subtle objection to Hartshorne’s theory of omniscience is that it represents God as ignorant of certain truths. To be sure, the neoclassical God perfectly knows the past—what did or did not happen—but does God, as so conceived, know everything that will or will not happen? Consider a person, P, at time T1 as yet undecided about a difficult choice: will P choose B or not-B? Let us suppose that at T2 the person decides B. On Hartshorne’s account, God knows at T2 that P chooses B, but God does not know at T1 that P will choose B. The argument can be further refined: an omniscient being knows all truths; at T1, either “P will choose B” or “P will choose not-B” is true; the neoclassical God does not know at T1 which of the statements is true; therefore, this God is not omniscient.

Hartshorne’s initial response to this objection, in a 1939 article, was to argue, in effect, that there are three truth values: true, false, and indeterminate. According to this view—which may have been Aristotle’s—future tense statements have an indeterminate truth value. Hartshorne was unhappy with this idea because it requires abandoning the law of excluded middle; if p concerns a future event, then “p or not-p” is best construed as indeterminate rather than (as in traditional logic) a tautology. In Man’s Vision of God and the Logic of Theism, Hartshorne hit upon a different response to the argument, one which he would develop more fully in an article in Mind in 1965 (reprinted in Creative Experiencing). Hartshorne’s mature position was to argue that “P will choose B” and “P will choose not-B” are best construed as contraries rather than contradictories. The strict contradictory of “P will choose B” is “P may not choose B” and the strict contradictory of “P will not choose B” is “P may choose B.” The statement forms in the triad—“P will choose B,” “P will not choose B,” and “P may or may not choose B”—are mutually exclusive: if one is true the other two are false. In this way, Hartshorne preserves the law of excluded middle as to truth values while allowing for the openness of the future.

Since, on Hartshorne’s view, “will” and “will not” statements are contraries, it is incorrect to represent them in the sentential meta-language as, respectively, p and not-p. Rather, “X will occur” and “X will not occur” should be represented as p and q, where ~ (p & q) (that is, “not-(p and q)”). A similar mapping of object language expressions onto sentential meta-language is needed in other domains as when one represents the pairs of contraries, “commands X” vs. “forbids X” or “legally requires X” vs. “legally requires not-X”: the remaining alternative in each case, respectively, is “makes no command with respect to X” and “there is no legal requirement with respect to X.” The metaphysical underpinning of Hartshorne’s proposed semantics of future tense statements is his indeterminism, according to which past causal conditions require (X will occur), exclude (X will not occur), or permit (X may or may not occur) various effects in the future.
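
Hartshorne’s proposal can be set out compactly. The lettering below is an interpretive gloss rather than Hartshorne’s own notation: let W abbreviate “X will occur” (the causal conditions require X), N abbreviate “X will not occur” (the conditions exclude X), and M abbreviate “X may or may not occur” (the conditions permit X without requiring it). The three forms are mutually exclusive and jointly exhaustive, so exactly one is true; W and N are contraries, not contradictories, since both are false whenever M is true. The strict contradictory of each “will” statement is the disjunction of the other two:

\[
\neg(W \wedge N), \qquad \neg W \equiv (N \vee M), \qquad \neg N \equiv (W \vee M), \qquad W \vee \neg W \ \text{remains a tautology}.
\]

On this mapping, excluded middle is preserved without assigning a third truth value to statements about an open future.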

Anticipating an objection, Hartshorne admits that it seems paradoxical to say that “X will occur,” as a prediction, is false even when X in fact occurs. Hartshorne replies that the “paradox” may be no more problematic than the familiar fact that a false scientific law can be verified (or corroborated). This is simply one more instance of the so-called paradox of material implication. We accept that “if p then q” is true when p is false, even if this seems counter-intuitive. The paradox dissolves upon the realization that any other truth functional definition of the conditional besides the standard one—“if p then q” is equivalent by definition to “not-p or q”—yields manifestly invalid inferences. Hartshorne takes a clue from Popper and says that the decisive operation where “will be” statements are concerned is falsification. “X will occur” is shown to be false when X does not occur, but it is not shown to have been true when X occurs. Hartshorne’s view requires that, in the strictest philosophical sense, “will be” statements are disguised “must be” statements. Intuitions among competent speakers of the language differ on this point so it is reasonable not to expect the issue to be decided by ordinary language. When Scrooge, in Dickens’ A Christmas Carol, asks the Ghost of Christmas Future whether he is seeing the shadows of the things that “will be” or the shadows of the things that “may be only,” he is expressing in a precise way Hartshorne’s analysis of future tense statements. If the shadows are of the things that “will be,” then all hope is lost, but if they are the shadows of the things that “may be only” then Scrooge can change his ways and make for himself a different future.
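
The truth-functional conditional to which this paragraph appeals can be written out in a standard truth table; the table adds nothing to the argument but makes explicit the rows that generate the so-called paradox:

\[
\begin{array}{cc|c|c}
p & q & \neg p \vee q & p \rightarrow q\\
\hline
T & T & T & T\\
T & F & F & F\\
F & T & T & T\\
F & F & T & T
\end{array}
\]

The last two rows display the feature in question: “if p then q” counts as true whenever p is false, however counter-intuitive that may at first appear.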

Our discussion to this point has followed philosophical orthodoxy by focusing on whether God knows the truth values of propositions. For Hartshorne, however, this question is secondary, for there is more to knowledge than knowledge that a proposition is true. In The Principles of Psychology, William James, following John Grote, distinguished “knowledge of acquaintance” and “knowledge-about,” a distinction later made famous by Bertrand Russell, who spoke of “knowledge by acquaintance” and “knowledge by description.” To have information about something or someone is not the same as having first-hand awareness of them. The two sorts of knowledge are related as more abstract to more concrete. It is one thing to read about a battle, quite another to have experienced it for oneself. Moreover, as a general rule, the more abstract the knowledge, the more emotionally detached it can be. The basic form of knowledge that Hartshorne attributes to deity is direct acquaintance through the affective bonds of feeling; Hartshorne adopts Whitehead’s term “prehension” for the most concrete facts of relatedness among dynamic singulars. If God’s knowledge is prehensive, it is perhaps easier to understand why Hartshorne resists the idea that God knows the future as determinate: no one is acquainted with the future; at best one has knowledge of acquaintance of the future as an array of tendencies towards actualization or as possibilities entertained. Moreover, conceiving God’s relations with the creatures as prehensive places emphasis on the affective dimension of divine knowing. God’s knowing, as feelings of the feelings of others, can then be conceived as a form of caring.

Hartshorne’s theory posits God’s perfect knowledge of the future as relatively indeterminate and of the past as determinate. Yet the past, even if it is determinate as Hartshorne claims, is no longer. Does this mean that God also lacks knowledge of acquaintance with the past? Hartshorne answers in the negative, and it is important to understand his reasons. A creature, having specific spatio-temporal location, has acquaintance with at most a vanishingly small segment of events in space and time, and even that knowledge is shot through with fallibility. Most of our knowledge of the past is through inference and by description. We know by acquaintance the past we have lived, but most of our knowledge of the past is about the past. God’s knowledge is both quantitatively and qualitatively different. Divine experience encompasses everything that has ever come to pass. As a localized individual has acquaintance with its past, God, in an analogous fashion, has acquaintance with all that is past. God, moreover, not only knows all of the past but knows it with perfect adequacy. God’s is the eminent form of prehension. On Hartshorne’s principles, the distant past must be as vivid for God as the recent past. In other words, the past does not “fade” for God. The difference, for God, between distant past events and recent ones is in the knowledge that recent events were preceded by the distant ones, whereas there was a time when the recent events were, at best, outlines of what could be relative to distant past events.

The extent of God’s knowledge of the past is a point of contention between Whitehead (or Whiteheadians) and Hartshorne. In the concluding lines of Process and Reality, Whitehead speaks of how creaturely achievements, though transient, are everlastingly remembered by God, making them objectively immortal. The “unfading importance of our immediate actions” is said to “perish and yet live for evermore.” In the Library of Living Philosophers volume on Hartshorne, Lewis Ford interprets Whitehead to mean that each actual occasion (Hartshorne’s dynamic singulars) undergoes a two-stage process, its coming-to-be (during which it is a subject of experience) and its objectification (in which it ceases to be a subject of experience) in the coming-to-be of subsequent occasions. According to Ford, it was Whitehead’s “momentous discovery” in metaphysics that the subject/object distinction is a difference in temporal modality; that is to say, an occasion’s status in the present, as it comes to be, is to be a subject, but as past it is an object. Hartshorne agrees with much of this analysis, but he objects to Whitehead’s metaphor of perishing. Hartshorne contends that the objects that are prehended by subsequent occasions are past subjects. If the being of an actual entity is constituted by its becoming, as Whitehead says (and Hartshorne agrees), then God’s prehension of an occasion is precisely God’s feeling of that occasion’s feelings. What exists everlastingly in the divine memory is not merely a knowledge that a dynamic singular felt in a particular way, but an acquaintance with how it felt. Hartshorne likens God’s memory of a person’s experiences to the person’s own vivid recollection of their past experiences.

6. Panentheism

A distinctive feature of Hartshorne’s theism, and one that sets it apart from Whitehead’s theism, is that God includes the universe in a way that bears a distant analogy with the way that a person includes his or her body. Until 1941 Hartshorne spoke of a “new pantheism,” but afterwards he spoke of panentheism, meaning that all (pan) is in (en) God (theos). Hartshorne cited the World-Soul analogy in some of Plato’s later dialogues as an anticipation of panentheism. Hartshorne, however, divests the doctrine of any vestige of mind-body dualism. God is not an immaterial entity haunting the universe; rather, as Hartshorne says in Omnipotence and Other Theological Mistakes, God is “the individual integrity of ‘the world,’ which otherwise is just the myriad creatures.” Hartshorne relies on modern cell theory for an analogy which, of course, was unavailable to Plato. Every localized dynamic singular is, as it were, a cell in the body of God. An important disanalogy is that the universe, unlike a body within the universe, has no environment external to itself. Thus, in the divine case, the “body” of God and the “environment” in which God operates are one and the same. Hartshorne expresses this idea by saying that God’s “environment” is wholly internal. He adds that the disanalogy explains why there are no specialized organs—such as liver, heart, and brain—in the divine body as there must be in a localized body. Specialized organs allow a localized body to monitor itself in its relation to its environment, but there is no other environment for God to negotiate except the universe. Dombrowski rightly says that, for Hartshorne, it is as true to say that the cosmos is ensouled as to say that God is embodied (Dombrowski 1996, 86).

Hartshorne also used analogies of persons related to persons as symbolic language for the relationship between God and the creatures. He was deeply critical, however, of the male bias of traditional theology. The few female metaphors used for God in the Bible, for example, were overshadowed by the dominance of male images—Lord of Hosts, Father, King—which reinforced patriarchal attitudes. Hartshorne considered himself a feminist. Sometime in the late 1970s or early 1980s, he was alerted to the problem of sexism in language and so he began using inclusive language, as one can see in Omnipotence and Other Theological Mistakes and elsewhere. He said that, in retrospect, it would have been better had his early book Man’s Vision of God been titled Our Vision of God (Auxier and Davies 2001, 159). Hartshorne’s feminism is also apparent in a variation he gives to panentheism. He argued that the relationship between mother and fetus is decidedly more intimate than the relation between father and fetus. Thus, for some purposes, the analogy of a pregnant mother for the relation between God and the creatures is preferable to any male counterpart. Of course, the pregnancy analogy, like all symbolic language for deity, has a restricted use. Nevertheless, re-imagining God as a woman is a useful reminder of the male bias of traditional theology, and it helps to highlight aspects of the God/World relationship that were obscured by that bias.

Analogies like World-Soul, person-cell, or pregnancy are at best distant approximations for the relationship of God to the world. As metaphors they are literally false, but they are aids in understanding what Hartshorne has in mind when he says that God includes the world. Hartshorne’s argument for panentheism is disarmingly simple: If God is the greatest conceivable reality, then God must include all that is valuable in the universe. Otherwise, there would be a reality greater than God, namely, the universe-plus-God. Could God include what is valuable in the universe without including the universe? Hartshorne does not think so. Each dynamic singular that comes to be is not simply an additional fact; it is, by virtue of Hartshorne’s panexperientialist psychicalism, also a value-achievement, and that value-achievement is greater in more complex organisms. This article has previously used the examples of Mozart and Beethoven as introducing new values into the universe, but other examples are legion. The sum total of value in the universe, which is inseparable from the dynamic singulars that comprise it, is ever increasing according to Hartshorne’s process-relational metaphysic. It must therefore be included within God if God is to be conceived as the reality than which none is greater.

Norris Clarke says that medieval philosophers anticipated Hartshorne’s argument and replied to it (Clarke 1990, 108). They said that the reality described by “God plus the universe” involves more beings in a quantitative sense, but not greater perfection of being in a qualitative sense. More precisely, says Clarke, “God plus the universe” means that there are more sharers in being. All value is in God, and the creatures merely share or participate in that value. By way of analogy, Clarke says that a mathematician may impart her knowledge to her students. Once the students learn what the teacher has to teach, there is not more knowledge in the class; there are only more sharers of the knowledge. A different analogy, however, could be used to bring out the distinctiveness of Hartshorne’s view. A music teacher may provide her class with the basics of theory and composition, but the students can create new musical pieces, each with a value of its own. In this example, there are not simply more sharers of being, but more creators of value. The medieval response that Clarke gives is defective, on Hartshorne’s reckoning, at precisely the point that process-relational theology departs from classical theism: the universe is not simply a product of divine creativity but of multiple creative agents. Classical theism had the unhappy consequence of divesting the creatures of any value that is their own, except for what is on loan from God. On the classical view, the sum-total value or perfection of existence is the same whether or not the creatures exist. For this reason, Hartshorne considered his panentheism to give a better account than classical theism of what it means to serve God. If the value in a creature is wholly borrowed from God, then the individual can offer God nothing that did not already belong to God by natural endowment. For Hartshorne, on the other hand, the creatures may be imperfect, but they are not mere conduits for values that God already possesses. On the contrary, their value contributes to that of God—hence, Hartshorne’s expression, “contributionism.”

A question that Hartshorne raised in Man’s Vision of God and the Logic of Theism and that he discussed with E. S. Brightman in their correspondence was whether it is possible for God to include individuals that hold erroneous beliefs without also holding those beliefs. Put somewhat differently, if different individuals hold contradictory beliefs and God “includes” those individuals and their beliefs, does God hold contradictory beliefs? Similar puzzles can be raised about God’s inclusion of individuals who commit terrible crimes—is the evil of the criminal deed a property of God? Or again, can God include creatures who are anxious about their death without also being anxious about death? Hartshorne replies that the logic of parts and wholes is such that they do not necessarily share properties—for example, a sand dune is not the size of a grain of sand even though it is made of grains of sand. Each part of the universe, Hartshorne holds, is a dynamic singular with an activity of its own that is not simply the activity of the universe as a whole (this is another way of expressing indeterminism). By parity of reasoning, these centers of individual activity, or the organisms of which they are parts, can have properties (such as false beliefs, evil deeds, or anxiety about death) that are not shared by the whole. A person can remember formerly holding a false belief or doing something wrong; God, by analogous extension, can prehend—that is, make part of the divine life—the errors and sins of the creatures without thereby being in error or sinning. It is important to add that while Hartshorne denies that God is the author of creaturely lack of wisdom and virtue, God nevertheless suffers their negative effects. In Creativity in American Philosophy, Hartshorne maintains that God feels how others feel without feeling as they feel (1984, 199).

Two advantages of panentheism, as Hartshorne argues for it, are that it provides a ready argument in support of monotheism and that it addresses the empiricist challenge of how to identify the referent of the word “God.” If God is an all-inclusive reality, then there can be only one God because there can be only one all-inclusive reality. In “Synthesis as Polydyadic Inclusion,” Hartshorne defines inclusion in these terms: if X includes Y, then X + Y = X (1976, 247). If X denotes God and Y denotes the universe, then God plus the universe is God. The argument that there could not be two all-inclusive deities is this: suppose W and X are two all-inclusive deities; this means that each must include the other. That is to say, W + X = X and W + X = W, but in that case, W = X. As for the empiricist challenge, the conditions for the identification of the panentheistic God are not the same as would be required to identify a localized being. Individuals within the cosmos occupy a tiny portion of the universe for a vanishingly brief period. Their influence is felt locally but not universally. God, on the other hand, is affected by all and affects all. As Hartshorne says, God is the one individual with strictly universal functions (1948, 31; 1967, 76). From this, he infers that God is the one individual identifiable, or picked out, by concepts alone. Other individuals have properties that might have been had by others (for example, Obama was the Democratic candidate for President in 2008, but he need not have been) and the properties they actually have might have been different (for example, Hillary Clinton was born in Chicago, but she could have been born elsewhere). The formal properties of God as all-inclusive are unique to God: no other individual has universal functions. One might search the earth for Obama or Clinton, but it would be profoundly misguided to search the earth, or the cosmos, for God. The description of God in the book of Acts is applicable to Hartshorne’s panentheism: God is the one “in whom we live, move, and have our being.”
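
Hartshorne’s definition of inclusion and the uniqueness argument that rests on it can be written out step by step; the lettering follows the paragraph above:

\[
\text{Definition: } X \text{ includes } Y \;=_{df}\; X + Y = X.
\]

\[
\begin{aligned}
&\text{Suppose } W \text{ and } X \text{ are both all-inclusive.}\\
&\text{Then } W \text{ includes } X, \text{ so } W + X = W;\\
&\text{and } X \text{ includes } W, \text{ so } W + X = X.\\
&\text{Therefore } W = X.
\end{aligned}
\]

Two all-inclusive realities therefore collapse into one, which is the formal core of Hartshorne’s claim that panentheism supports monotheism.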

7. Conclusion

The amount of energy that Hartshorne devoted to questions surrounding the nature and existence of God might lead one to classify him as a theologian. Yet, his defense of dipolar theism presupposes no sectarian dogma, makes no appeals to “revealed” truths or books, and privileges no mystical experience. There can be no question that he was first and foremost—as he himself emphasized—a philosopher. Various ideas about deity that he defended, most notably his critique of divine immutability and impassibility, have been widely influential, although few would be willing to call themselves Hartshorneans. A case in point is the late William P. Alston, who had been a student in Hartshorne’s class and who, late in his career, attempted to find a mediating position between Hartshorne and Aquinas. Another example of Hartshorne’s influence is that he spoke explicitly of “the openness of God” fully thirty years before that expression was adopted by a group of evangelical Christians to describe a deity that is open to creaturely influence and that faces a relatively open future. Some of the major figures in that movement—William Hasker, Gregory Boyd, and Richard Rice—acknowledge a debt to Hartshorne’s arguments for conceiving God in relational terms even as they distance themselves from the heterodox elements of his thinking. One may also mention Hartshorne as a pioneer who contributed to the recent widespread interest among philosophers of religion in panentheism. Carol Christ, long at the forefront of feminist theology, sees in Hartshorne’s work philosophically sophisticated ways of “re-imagining the divine in the world.”

Although Hartshorne was a philosopher, his work has attracted the attention of theologians. In 1973, a volume devoted to his thought was published in a series titled “Makers of the Modern Theological Mind.” Many theologians, such as Schubert Ogden (who studied with Hartshorne at Chicago), Marjorie Suchocki, Sheila Greeve Davaney, Anna Case-Winters, and Theodore Walker, Jr., have critically appropriated Hartshorne’s philosophical theology. John B. Cobb Jr. (who also studied with Hartshorne at Chicago) once commented that it is often the case that a philosopher who gains a following among theologians is regarded with suspicion by other philosophers. This tendency may be less prominent since the resurgence of interest in philosophy of religion in the closing decades of the twentieth century. Of course, Hartshorne was active throughout the century, vigorously defending the rationality of dipolar theism in the heyday of the Vienna Circle. At a time when religious discourse was widely regarded as nonsensical, Hartshorne met and challenged the positivists on their own terms. It is fair to say that Hartshorne’s insistence on high standards of logical rigor owed something to his Chicago colleague Rudolf Carnap. Carnap was, in turn, constructively engaged with Hartshorne’s work. Carnap was reportedly intrigued by Hartshorne’s formal reductio ad absurdum argument against the coherence of the classical attributes of deity as developed in The Divine Relativity, and he worked closely with Hartshorne on that book’s technical appendix to Chapter II, “Relativity and Logical Entailment.”

Hartshorne’s development of a philosophical theology according to which God is transcendent yet inseparable from temporal processes is arguably one of his lasting achievements. His defense of divine relativity may well be the single most important factor in dissolving the near consensus that once prevailed that an entirely unchanging and eternal deity should be considered normative for theology. He considered the deity of the classical tradition as at once too active and too passive. It is too active in the sense that nothing falls outside its control; the creatures are left to unwittingly play roles decided for them in eternity—“imitations of life” as Jules Lequyer called them. It is too passive in the sense that it cannot change or be affected by the triumphs and tragedies of the creatures. In short, it is a deity that acts but is never acted upon and can therefore never interact. This is captured in the Aristotelian formula that was borrowed and reinterpreted by medieval thinkers to denote the God of the Abrahamic traditions: God as the “unmoved mover.” In a discussion of Mortimer Adler’s use of this formula, Hartshorne once called it a half-truth parading as the full truth. Hartshorne admired Abraham Heschel for reversing this idea by calling God the “most moved mover” (a phrase later adopted by Clark Pinnock). Hartshorne amended this formula to distill the essence of dipolar or neoclassical theism: God is the most and best moved mover.

8. Suggestions for Further Reading

a. Primary Sources

i. Books (in order of appearance)

  • Hartshorne, Charles. 1941. Man’s Vision of God and the Logic of Theism. Chicago: Willett, Clark and Company.
  • Hartshorne, Charles. 1948. The Divine Relativity: A Social Conception of God. New Haven, Connecticut: Yale University Press.
  • Hartshorne, Charles. 1953. Reality as Social Process: Studies in Metaphysics and Religion. Boston: Beacon Press.
  • Hartshorne, Charles and William L. Reese, editors. 1953. Philosophers Speak of God. Chicago: University of Chicago Press. Republished in 2000 by Humanity Books.
  • Hartshorne, Charles. 1962. The Logic of Perfection and Other Essays in Neoclassical Metaphysics. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1965. Anselm’s Discovery: A Re-examination of the Ontological Proof for God’s Existence. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1967. A Natural Theology for Our Time. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1970. Creative Synthesis and Philosophic Method. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1976. Aquinas to Whitehead: Seven Centuries of Metaphysics of Religion. Milwaukee, Wisconsin: Marquette University Publications.
  • Hartshorne, Charles. 1984. Creativity in American Philosophy. Albany: State University of New York Press.
  • Hartshorne, Charles. 1984. Omnipotence and Other Theological Mistakes. Albany: State University of New York Press.
  • Hartshorne, Charles. 1987. Wisdom as Moderation: A Philosophy of the Middle Way. Albany: State University of New York Press.
  • Hartshorne, Charles. 1997. The Zero Fallacy and Other Essays in Neoclassical Philosophy, edited by Mohammad Valady. Peru, Illinois: Open Court Publishing Company.
  • Hartshorne, Charles. 2011. Creative Experiencing: A Philosophy of Freedom, edited by Donald W. Viney and Jincheol O. Albany: State University of New York Press.
  • Auxier, Randall E. and Mark Y. A. Davies, editors. 2001. Hartshorne and Brightman on God, Process, and Persons: The Correspondence, 1922-1945. Nashville: Vanderbilt University Press.
  • Viney, Donald W., guest editor. 2001. Process Studies, Special Focus on Charles Hartshorne, 30/2 (Fall-Winter)
  • Viney, Donald W., guest editor. 2011. Process Studies, Special Focus Section: Charles Hartshorne, 40/1 (Spring/Summer): 91-161.

ii. Hartshorne’s Response to his Critics

  • Cobb, John B. Jr. and Franklin I. Gamwell, editors. 1984. Existence and Actuality: Conversations with Charles Hartshorne. Chicago: University of Chicago Press.
  • Hahn, Lewis Edwin, editor. 1991. The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. La Salle, Illinois: Open Court.
  • Kane, Robert and Stephen H. Phillips, editors. 1989. Hartshorne, Process Philosophy and Theology. Albany: State University of New York Press.
  • Sia, Santiago, editor. 1990. Charles Hartshorne’s Concept of God: Philosophical and Theological Responses. Dordrecht, the Netherlands: Kluwer Academic Publishers.

iii. Selected Articles

  • Hartshorne, Charles. 1945. Entries for “Eternal” (256), “Eternity” (257), “Foreknowledge, Divine” (284), “Omniscience” (546-47), “Time” (787-88), “Transcendence” (791-92) in An Encyclopedia of Religion, ed. Vergilius Ferm. New York: Philosophical Library.
  • Hartshorne, Charles. 1950. “The Divine Relativity and Absoluteness: A Reply [to John Wild].” Review of Metaphysics 4, 1: 31-60.
  • Hartshorne, Charles. 1966. “A New Look at the Problem of Evil,” Current Philosophical Issues: Essays in Honor of Curt John Ducasse, edited by Frederick C. Dommeyer. Springfield, Illinois: Charles C. Thomas: 201-212.
  • Hartshorne, Charles. 1967. “Religion in Process Philosophy,” Religion in Philosophical and Cultural Perspective: A New Approach to the Philosophy of Religion Through Cross Disciplinary Studies, edited by J. Clayton Feaver and William Horosz. Princeton, New Jersey: D. Van Nostrand Company, Inc.: 246-268.
  • Hartshorne, Charles. 1967. “The Dipolar Conception of Deity.” Review of Metaphysics 21, 2: 273-89.
  • Hartshorne, Charles. 1969. “Divine Absoluteness and Divine Relativity.” Transcendence, eds. Herbert W. Richardson and Donald R. Cutler. Boston: Beacon: 164-71.
  • Hartshorne, Charles. 1971. “Could There Have Been Nothing? A Reply [to Houston Craighead].”  Process Studies 1, 1: 25-28.
  • Hartshorne, Charles. 1976. “Synthesis as Polydyadic Inclusion: A Reply to Sessions’ Charles Hartshorne and Thirdness,” Southern Journal of Philosophy 14/2: 245-55.
  • Hartshorne, Charles. 1977. “Bell’s Theorem and Stapp’s Revised View of Space-Time.” Process Studies 7/3 (Fall): 183-191.
  • Hartshorne, Charles. 1978. “Theism in Asian and Western Thought.” Philosophy East and West 28, 4: 401-11.
  • Hartshorne, Charles. 1980. “Mysticism and Rationalistic Metaphysics.” Understanding Mysticism, edited by Richard Woods. Garden City, New York: Image: 415-421.
  • Hartshorne, Charles. 1984. “Toward a Buddhisto-Christian Religion.” Buddhism and American Thinkers, edited by Kenneth K. Inada and Nolan P. Jacobson. Albany: State University of New York Press: 1-13.
  • Hartshorne, Charles. 1992. “The Aesthetic Dimensions of Religious Experience.” Logic, God and Metaphysics, edited by James Franklin Harris. Dordrecht: Kluwer Academic Publishers: 9-18.
  • Hartshorne, Charles. 1993. “Can Philosophers Cooperate Intellectually: Metaphysics as Applied Mathematics.” The Midwest Quarterly 35/1 (Autumn): 8-20.

b. Secondary Sources

  • Blanchette, Oliva. 1994. “The Logic of Perfection in Aquinas.” Thomas Aquinas and His Legacy. Edited by David M. Gallagher. Studies in Philosophy and the History of Philosophy, Volume 28. Washington, D.C.: The Catholic University of America Press: 107-130.
  • Boyd, Gregory A. 1992. Trinity and Process: A Critical Evaluation and Reconstruction of Hartshorne’s Di-Polar Theism Towards a Trinitarian Metaphysics. New York: Peter Lang.
  • Burrell, David B. 1982. “Does Process Theology Rest on a Mistake?” Theological Studies 43/1 (March): 125-135.
  • Case-Winters, Anna. 1990. God’s Power: Traditional Understandings and Contemporary Challenges. Louisville, Kentucky: Westminster/John Knox Press.
  • Christ, Carol P. 2003. She Who Changes: Re-Imagining the Divine in the World. New York: Palgrave Macmillan.
  • Clarke, Bowman. 1966. Language and Natural Theology. The Hague: Mouton & Co.
  • Clarke, Bowman. 1995. “Two Process Views of God.” God, Reason and Religions: New Essays in the Philosophy of Religions. Edited by Eugene Thomas Long. Dordrecht: Kluwer Academic Publishers: 61-74.
  • Clarke, W. Norris. 1990. “Charles Hartshorne’s Philosophy of God: A Thomistic Critique,” Charles Hartshorne’s Concept of God: Philosophical and Theological Responses. Edited by Santiago Sia. Dordrecht: Kluwer Academic Publishers: 103-23.
  • Davaney, Sheila Greeve. 1986. Divine Power: A Study of Karl Barth and Charles Hartshorne. Harvard Dissertations in Religion, number 19. Philadelphia: Fortress Press.
  • Dombrowski, Daniel A. 1996. Analytic Theism, Hartshorne, and the Concept of God. Albany: State University of New York Press.
  • Dombrowski, Daniel A. 2004. Divine Beauty: The Aesthetics of Charles Hartshorne. Nashville, Tennessee: Vanderbilt University Press.
  • Enxing, Julia and Klaus Müller, editors. 2011. Perfect Changes: Die Religionsphilosophie Charles Hartshornes. Regensburg: Friedrich Pustet.
  • Enxing, Julia. 2013. Gott im Werden. Die Prozesstheologie Charles Hartshorne. Regensburg: Friedrich Pustet.
  • Fitzgerald, Paul. 1972. “Relativity Physics and the God of Process Philosophy.” Process Studies 2/4 (Winter): 251-276.
  • Ford, Lewis S. 1968. “Is Process Theism Compatible with Relativity Theory?” Journal of Religion 48/2 (April): 124-135.
  • Geisler, Norman L. 1976. “Process Theology.” Tensions in Contemporary Theology. Edited by Stanley N. Gundry and Alan F. Johnson. Chicago: Moody Press: 235-284.
  • Gragg, Alan. 1973. Charles Hartshorne, Maker of the Modern Theological Mind, edited by Bob E. Patterson. Waco, Texas: Word Books Publisher.
  • Griffin, David Ray, John B. Cobb Jr., Marcus P. Ford, Pete A. Y. Gunter, and Peter Ochs. 1993. Founders of Constructive Postmodern Philosophy: Peirce, James, Bergson, Whitehead, and Hartshorne. Albany: State University of New York Press.
  • Gruenler, Royce Gordon. 1983. The Inexhaustible God: Biblical Faith and the Challenge of Process Theism. Grand Rapids, Michigan: Baker Book House.
  • Gunton, Colin E. 1978. Becoming and Being: The Doctrine of God in Charles Hartshorne and Karl Barth. Oxford: Oxford University Press.
  • James, Ralph E. 1967. The Concrete God, A New Beginning for Theology—The Thought of Charles Hartshorne. Indianapolis, Indiana: The Bobbs-Merrill Company.
  • Kachappilly, Kurian. 2002. God of Love: A Neoclassical Inquiry. Bangalore, India: Dharmaram Publications.
  • Moskop, John C. 1984. Divine Omniscience and Human Freedom: Thomas Aquinas and Charles Hartshorne. Foreword by Charles Hartshorne. Macon, Georgia: Mercer University Press.
  • Myers, William, guest editor. 1998. The Personalist Forum, Special Issue on Charles Hartshorne, 14/2 (Fall).
  • Nash, Ronald H. editor. 1987. Process Theology. Grand Rapids, Michigan: Baker Book House.
  • Neville, Robert C. 1980. Creativity and God: A Challenge to Process Theology. New York: The Seabury Press.
  • Neville, Robert C. 2009. Realism in Religion: A Pragmatist’s Perspective. Albany: State University of New York Press.
  • Peters, Eugene H. 1970. Hartshorne and Neoclassical Metaphysics. Lincoln: University of Nebraska Press.
  • Pratt, Douglas. 2002. Relational Deity: Hartshorne and Macquarrie on God. Lanham, Maryland: University Press of America.
  • Ramal, Randy, editor. 2010. Metaphysics, Analysis, and the Grammar of God: Process and Analytic Voices in Dialogue. Tübingen, Germany: Mohr Siebeck.
  • Sanders, John. 2007. The God Who Risks: A Theology of Divine Providence, revised edition. Downers Grove, Illinois: IVP Academic.
  • Shields, George W. 1983. “God, Modality and Incoherence.” Encounter 44/1: 27-39.
  • Shields, George W. 1992. “Hartshorne and Creel on Impassibility,” Process Studies 21/1 (Spring): 44-59.
  • Shields, George W. 1992. “Infinitesimals and Hartshorne’s Set-Theoretic Platonism.” The Modern Schoolman 49/2 (January): 123-134.
  • Shields, George W. 2003. “Omniscience and Radical Particularity: Reply to Simoni,” Religious Studies 39/2 (October).
  • Shields, George W. 2009. “Quo Vadis?: On Current Prospects for Process Philosophy and Theology,” The American Journal of Theology & Philosophy, 30/2 (May).
  • Shields, George W. 2010. “Eternal Objects, Middle Knowledge, and Hartshorne: A Response to Malone-France,” Process Studies, 39/1 (Spring/Summer): 149-165.
  • Shields, George W. 2010. “Panexperientialism, Quantum Theory, and Neuroplasticity” in Process Approaches to Consciousness, eds. Michel Weber and A. Weekes. (Albany: State University of New York Press).
  • Shields, George W., editor. 2003. Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Albany: State University of New York Press.
  • Sia, Santiago. 1985. God in Process Thought: A Study in Charles Hartshorne’s Concept of God. Postscript by Charles Hartshorne. Dordrecht, the Netherlands: Martinus Nijhoff.
  • Sia, Santiago. 2004. Religion, Reason and God: Essays in the Philosophy of Charles Hartshorne and A. N. Whitehead. Frankfurt am Main: Peter Lang.
  • Sia, Santiago, editor. 1986. Process Theology and the Christian Doctrine of God, special edition of Word and Spirit, a Monastic Review, 8. Petersham, Massachusetts: St. Bede’s Publications.
  • Simoni-Wastila, Henry. 1999. “Is Divine Relativity Possible? Charles Hartshorne on God’s Sympathy with the World.” Process Studies 28/1-2 (Spring-Summer): 98-116.
  • Sprigge, T. L. S. 2006. The God of Metaphysics. Oxford: Clarendon Press.
  • Suchocki, Marjorie Hewitt and John B. Cobb, Jr. editors. 1992. Process Studies, Special Issue on the Philosophy of Charles Hartshorne, 21/2 (Summer).
  • Towne, Edgar A. 1997. Two Types of Theism: Knowledge of God in the Thought of Paul Tillich and Charles Hartshorne. New York: Peter Lang.
  • Viney, Donald Wayne. 1985. Charles Hartshorne and the Existence of God. Albany: State University of New York Press.
  • Viney, Donald Wayne. 1989. “Does Omniscience Imply Foreknowledge? Craig on Hartshorne.” Process Studies, 18/1 (Spring): 30-37.
  • Viney, Donald Wayne. 2000. “What is Wrong with the Mirror Image? A Brief Reply to Simoni-Wastila on the Problem of Radical Particularity,” Process Studies, 29/2 (Fall-Winter): 365-367.
  • Viney, Donald Wayne. 2005. “Hartshorne, Charles (1897-2000)” The Dictionary of Modern American Philosophers, edited by John R. Shook (London: Thoemmes Press): 1056-62.
  • Viney, Donald Wayne. 2006. “God as the Most and Best Moved Mover: Charles Hartshorne’s Importance to Philosophical Theology.” The Midwest Quarterly, 48/1: 10-28.
  • Viney, Donald Wayne. 2007. “Hartshorne’s Dipolar Theism and the Mystery of God.” Philosophia, 35: 341-350.
  • Wilcox, John T. 1961. “A Question from Physics for Certain Theists.” Journal of Religion 40/4 (October): 293-300.
  • Wood, Forest Jr. and Michael DeArmey, editors. 1986. Hartshorne’s Neo-Classical Theology. Tulane Studies in Philosophy, volume 34.

c. Bibliography

“Primary Bibliography of Philosophical Works of Charles Hartshorne” (compiled by Dorothy Hartshorne; corrected, revised, and updated by Donald Wayne Viney and Randy Ramal) in Herbert F. Vetter, editor, Hartshorne: A New World View: Essays by Charles Hartshorne (Cambridge, Massachusetts: Harvard Square Library, 2007): 129-160. Also published in Santiago Sia, Religion, Reason and God (Frankfurt am Main: Peter Lang, 2004): 195-223.

Author Information

Donald Wayne Viney
Email: don_viney@yahoo.com
Pittsburg State University
U. S. A.

and

George W. Shields
Email: George.shields@kysu.edu
Kentucky State University
U. S. A.

Charles Hartshorne: Biography and Psychology of Sensation

Charles Hartshorne is widely regarded as having been an important figure in twentieth century metaphysics and philosophy of religion. His contributions are wide-ranging. He championed the aspirations of metaphysics when it was unfashionable, and the metaphysic he championed helped change some of the fashions of philosophy. He counted some well-known scientists among his friends, and he embraced the deliverances of modern science (he never questioned, for example, the truth of evolution); however, he insisted that metaphysics and empirical science have different aims and methods, each ensuring in its own way a disciplined objectivity. His “neoclassical” or “process” metaphysics is in the same family of speculative philosophy that one finds in the works of Charles Sanders Peirce and the later writings of Alfred North Whitehead. Although he did not style himself a disciple of Peirce or of Whitehead, he made significant contributions to the study of these philosophers even as he developed his own views. Like them, he endeavored in his own metaphysical thinking to give full weight to the dynamic, relational, temporal, and affective dimensions of the universe. He emphasized, as few before him had, the foundational role of asymmetrical relations in logic and in the processes of nature.

Hartshorne was also a theist at a time when the coherence of theism was under attack from quarters as various as logical positivism and Sartre’s existentialism. Hartshorne’s name is inseparable from the revival of the ontological or modal argument for God’s existence; he devoted twenty-three articles and the better part of two books to the topic. He insisted, however, that it was unavailing to appeal to the ontological argument (or any theistic argument) as support for theism without first rethinking the concept of deity. He argued that thinking about God had been handicapped by lack of attention to the logically possible forms of theism, and in place of the unmoved mover of classical theology, he proposed “the most, and best, moved mover.” Hartshorne endorsed a “dipolar” version of theism according to which God is both necessary and contingent, but in different respects. He sought a “panentheism” in which God includes the creatures without negating their distinctiveness. He argued that no putative inerrant revelation or infallible institution could negate the effects of the inherent fallibility of human knowledge. He occasionally worried that his “highly rationalized” form of theism would not have wide appeal; on the other hand, it was precisely a God of love and the love of God that were ever his “intuitive clue[s]” in philosophy. His ideas about deity influenced both philosophy of religion and theology; Hartshorne argued that it is necessary to take seriously an alternative to classical understandings of God that avoids their shortcomings while preserving their best insights.

Hartshorne did not devote all of his intellectual energies to metaphysics and philosophical theology. His first book, The Philosophy and Psychology of Sensation (1934), ventured empirical hypotheses about sensation, a subject to which he returned intermittently throughout his life. Also of note is his Born to Sing: An Interpretation and World Survey of Bird Song (1973), which established him as a serious ornithologist. What follows is an overview of Hartshorne’s life as well as a discussion of his first book and its relation to the larger themes of his philosophy.

Table of Contents

  1. Biography
  2. The Affective Continuum and the Psychology of Sensation
  3. Conclusion: Hartshorne’s Work on Sensation and the Rest of his Philosophy
  4. References and Further Reading
    1. Primary Sources
      1. Life
      2. Psychology of Sensation
    2. Secondary Sources
      1. Life
      2. Psychology of Sensation
    3. Bibliography

1. Biography

Charles Hartshorne (pronounced “Harts-horne”; literally, “deer’s horn”) was born June 5, 1897 in Kittanning, Pennsylvania, the second of six children of Francis Cope Hartshorne (1868-1950), an Episcopal minister, and Marguerite Haughton Hartshorne (1868-1959). He and his brother Richard (1899-1992)—who would achieve fame as a geographer—attended Yeates Boarding School (1911-1915), where he acquired a life-long interest in ornithology. Later, he attended Haverford College (1915-1917), where he was a student of the Quaker mystic Rufus Jones. With America’s entry into the First World War imminent, Hartshorne volunteered for the medical corps and spent the war years (1917-1919) in Le Tréport, France as an orderly in a British hospital.

What Hartshorne referred to as “the second period” of his intellectual development began when he enrolled at Harvard in 1919. He majored in philosophy and minored in English literature. Among his teachers were James Haughton Woods (named after Hartshorne’s maternal grandfather), W. E. Hocking, H. M. Sheffer, Ralph Barton Perry, C. I. Lewis, and the psychologist L. T. Troland. He completed the Ph.D. in 1923, writing a 306-page dissertation titled An Outline and Defense of the Argument for the Unity of Being in the Absolute or Divine Good. The broad outlines of his later thought are evident in the dissertation, but he never published any part of it. He would later remark, in Creativity in American Philosophy (1984), that it was a form of process philosophy that was “somewhat naïve and best forgotten.” Nevertheless, he was productive throughout his career, writing twenty-one books and over five hundred articles and reviews.

After graduation, Hartshorne returned to Europe as a Sheldon Traveling Fellow (1923-1925). He spent most of his time in Germany, but he also visited England, France, and Austria. He was fluent in German and spoke French reasonably well. His travels were rich with intellectual stimulation. In Europe he encountered many philosophical luminaries, including Moritz Schlick, Heinrich Gomperz, Lucien Lévy-Bruhl, Edouard Le Roy, Lucien Laberthonnière, Samuel Alexander, R. G. Collingwood, J. S. Haldane, G. E. Moore, G. F. Stout, Harold H. Joachim, Richard Kroner, Oskar Becker, Julius Ebbinghaus, Max Scheler, Max Planck, Adolf Harnack, Jonas Cohn, Paul Natorp, and Nicolai Hartmann. The most famous philosophers he met and with whom he studied were Edmund Husserl and Martin Heidegger. On his return to the United States, Hartshorne wrote the first English language review of Heidegger’s Sein und Zeit (Being and Time); the review appeared in the Philosophical Review and was published as part of the penultimate chapter of his second book.

Hartshorne was an Instructor and Research Fellow at Harvard (1925-1928) where he was simultaneously exposed to the two thinkers with whose philosophies he felt the most affinity: Charles Sanders Peirce (1839-1914) and Alfred North Whitehead (1861-1947). Boxes of Peirce’s unpublished manuscripts were donated to the Harvard library by Peirce’s widow, and Hartshorne was given the assignment of editing these papers. In 1927, Paul Weiss joined Hartshorne in the project. The Collected Papers of Charles Sanders Peirce was published in six volumes between 1931 and 1935 and would become the standard edition of Peirce’s work throughout the century.  Although Hartshorne published enough articles on Peirce to fill a book—a total of seventeen—neither he nor Weiss thought of becoming Peirce scholars. Hartshorne’s duties at Harvard also included helping to grade papers for Whitehead, who was a recent addition to the faculty (1924). As Whitehead’s assistant, Hartshorne witnessed the Englishman develop “the philosophy of organism” that would find expression in Whitehead’s Gifford Lectures, published as Process and Reality (1929). This book, as well as others written by Whitehead during this period, formed the foundation of twentieth century process philosophy.

Hartshorne’s earliest writings, prior to his encounter with Whitehead, emphasize process and relativity as metaphysically basic; for this reason, he characterized his relation to Whitehead (also to Peirce) as one of pre-established harmony. Just as he would write much on Peirce’s philosophy, so he promoted Whitehead’s importance in thirty-nine articles and reviews; thirteen of these articles are collected in Whitehead’s Philosophy: Selected Essays 1935-1970 (1972). For a time Hartshorne considered himself a Peircean and a Whiteheadian, in each case, as he said,  “with reservations”—in later years he emphasized the reservations. It is clear, in any event, that the exposure to Peirce and Whitehead helped him to focus his thinking. Whitehead’s works, in particular, provided him with a technical vocabulary for expressing his own metaphysics that in some respects overlaps with Whitehead’s but in other respects is very different from it. In the fullness of time, these differences led some Whitehead scholars to complain of an overly Hartshornean slant to Whitehead studies, thus bearing testimony to Hartshorne’s dominance. Hartshorne referred to the years between 1925 and 1958 as his “third period” to highlight the significant influence of Peirce and Whitehead on his thinking.

When Harvard announced that it had “no job” for Hartshorne after his third year of teaching and research, he took a position in 1928 at the University of Chicago, where he was a faculty member in the Department of Philosophy until 1955. He eventually held a joint appointment as a member of the Meadville Theological School (1943-1955). Shortly after the move to Chicago, he married Dorothy Eleanore Cooper (1904-1995), his life-long companion. The Hartshornes’ only child, Emily (Schwartz), was born in 1940. In 1936, he served as secretary (that is, chairperson) of the department of philosophy, during which time Rudolf Carnap was hired. Hartshorne was a visiting faculty member at Stanford University in 1937, and he spent the 1941-42 academic year at the New School in New York. From 1948 to 1949 he taught at Goethe University in Frankfurt and also lectured at the Sorbonne in Paris. He was president of the Western Division of the American Philosophical Association in 1949, and he was a Fulbright Lecturer at Melbourne, Australia during 1951-52. Hartshorne was also a member of the informal group of theologians called “the Chicago school,” which included Henry Nelson Wieman, Daniel Day Williams, Bernard Meland, and Bernard Loomer.

At Chicago, Hartshorne’s thinking matured, and he developed the outlines of his own system of speculative philosophy, which he called neoclassical metaphysics. The hiring of Carnap was especially ironic since he was the most famous of the logical positivists while Hartshorne was one of positivism’s greatest critics. However, Hartshorne reported that, despite his and Carnap’s profound differences in philosophical outlook, their engagement was cordial and fruitful. Carnap helped him to formalize his objection to the classical understanding of divine foreknowledge in his book The Divine Relativity (1948). Hartshorne published six books while at Chicago (in addition to the Peirce papers), including the wide-ranging survey of philosophical theology titled Philosophers Speak of God (1953/2000), edited with his student William L. Reese. Hartshorne’s other books during this period, apart from his first one, focused on the problems of metaphysics: Beyond Humanism: Essays in the Philosophy of Nature (1937), Man’s Vision of God and the Logic of Theism (1941), and Reality as Social Process: Studies in Metaphysics and Religion (1953).

Hartshorne attracted many graduate students from Chicago’s three federated seminaries, two of whom became well-known theologians (John B. Cobb, Jr., b. 1925, and Schubert Ogden, b. 1928). He was unhappy, however, that few graduate students in philosophy studied with him. Two of the most well-known students in Hartshorne’s classes were Richard Rorty (1932-2007) and Huston Smith (b. 1919). Each became known for defending views at odds with Hartshorne’s ideas: Rorty in philosophy and Smith in religious studies. Even as he disagreed sharply with his former teacher, Rorty made clear that he never ceased to admire Hartshorne’s intellectual passion and generosity of spirit.

Hartshorne and his family left Chicago and moved to Atlanta, Georgia in 1955, where he taught at Emory University until 1962. In 1958, he taught at the University of Washington and visited Kyoto, Japan as a Fulbright Lecturer. There he learned more about Buddhism, which he called the first process philosophy. It was also in Japan that he began a more intense focus on Anselm of Canterbury’s ontological argument for God’s existence. He would soon publish in the second chapter of The Logic of Perfection (1962), for the first time in the history of philosophy, a formalization of the argument using modal symbolism. Soon afterwards came Anselm’s Discovery (1965), which includes an overview of treatments of the ontological argument in the works of various philosophers and theologians. Hartshorne described this time in his life as the beginning of his “fourth period,” as he gained more critical distance from the philosophies of Peirce and Whitehead and began in earnest to refine his own metaphysical synthesis. Now in his sixties, he faced mandatory retirement at Emory at age sixty-eight. In 1962, John Silber, then at the University of Texas at Austin, invited Hartshorne to Texas. Hartshorne accepted the invitation and, in 1963, became Ashbel Smith Professor of Philosophy; he taught full-time until his official retirement in 1978, and part-time for a few years thereafter. During his years at Texas he taught and traveled widely: throughout the United States, including two summer sessions at Colorado College (1977 and 1979), and also to India and Japan on a third Fulbright (1966), to Australia (1974), to the University of Louvain, Belgium (1978), and again to Japan and Hawaii (1984).

Hartshorne’s productivity in the last three decades of his life was prodigious, beginning with four major works; these included the aforementioned book on Whitehead, the book on bird song, as well as A Natural Theology for Our Time (1967) and Creative Synthesis and Philosophic Method (1970), the latter being his most comprehensive and systematic presentation of neoclassical metaphysics. In his eighties, Hartshorne published dozens of articles, reviews, and forewords, and completed numerous books. Hartshorne gave his most complete assessment of western philosophy in Insights and Oversights of Great Thinkers: An Evaluation of Western Philosophy (1983) and in Creativity in American Philosophy (1984). Omnipotence and Other Theological Mistakes (1984) is a nontechnical introduction to his philosophical theology. The posthumously published Creative Experiencing: A Philosophy of Freedom (2011), completed during the 1980s, complements Wisdom as Moderation: A Philosophy of the Middle Way (1987) and more or less rounds out the technical metaphysical work begun in Creative Synthesis.

The last of Hartshorne’s books to appear during his lifetime, The Zero Fallacy and Other Essays in Neoclassical Philosophy (1997), published in the year of his centenary, was edited by Muhammad Valady, a philosopher he met in 1985. Valady made a thorough study of Hartshorne’s works and engaged him in conversation on a regular basis over lunch. Valady compiled the essays in The Zero Fallacy to reflect the full range of Hartshorne’s thinking, including his empirical work on sensation and on bird song (approximately half the book consists of essays not previously published). The book opens with a “brisk dialogue” between Hartshorne and Valady that conveys both the charm of a conversation with the aging philosopher and the keenness of his mind in dealing with philosophy. In his twilight years, Hartshorne also contributed to four books devoted exclusively to his thought, giving detailed replies to sixty-two essays by fifty-six scholars (see secondary sources, books edited by Cobb and Gamwell, Kane and Phillips, Sia, and Hahn). His responses fill approximately one fourth of the pages in these volumes. With good reason he expressed concern that philosophers might find it difficult to stay abreast of his writing.

Hartshorne died on Yom Kippur, October 9, 2000 (incorrectly reported as October 10th by The New York Times). He was preceded in death by his wife, who passed away at the age of ninety-one on November 21, 1995.

2. The Affective Continuum and the Psychology of Sensation

Hartshorne began thinking seriously about sensation after an experience he had while serving as an orderly in France during the First World War. As he stood on a cliff looking over a scene of great natural beauty, George Santayana’s phrase “beauty is objectified pleasure” came to him. Hartshorne rejected that slogan on the basis of what he was experiencing. It seemed to him that the pleasure was not experienced in himself as a subject and only then projected onto nature; rather the pleasure was itself given as in the object. He concluded that experience, all experience, is saturated with affect, given in emotional terms. In the essay “Some Causes of My Intellectual Growth” he says, “Nature comes to us as constituted by feelings, not as constituted by mere lifeless, insentient matter.” The point is not that we never attribute more to an object than what the object contains; it is, rather, that objects are never given to us in experience as completely lacking affective tone. Hartshorne never strayed from the conviction that matter devoid of feeling is an abstraction from experience and not a datum of experience.

Hartshorne’s first published book was The Philosophy and Psychology of Sensation (1934), the result of his intense philosophical interest in aesthetic motifs proffered by Peirce and Whitehead and his longstanding interest in empirical psychological inquiries into sensation begun under Troland at Harvard. This interest in empirical inquiries continued with the study of some European experimental psychologists such as Julius Pikler, whose name is sometimes paired with Hartshorne’s in the literature on sensation, as in “the Hartshorne-Pikler Hypothesis” discussed by Lawrence E. Marks in The Unity of the Senses (New York: Academic, 1978). Hartshorne argues for a theory that, in his view, integrates themes of evolutionary biology with experimental and phenomenological data on intersensory analogies, with aesthetic and religious values, and with an overall enhancement of intelligibility or the “unity of knowledge.” The work was written when interest in sensation had dwindled under the influence of American behaviorist theory, when the odd indifference of William James to considerations of sensation was still lingering, and when psychologists were little interested in grand theoretical integrations, including integrations with evolutionary theory. The work, arguably ahead of its time, can be much better appreciated now than when it was first published.

Hartshorne’s theory is organized around the defense of five theses, to be discussed in turn below: (1) the sensory modalities exhibit quantitative continuity, exhibiting no absolute difference of kind; (2) sensory qualia are essentially affective (a theme echoed in the early Heidegger with whom Hartshorne studied); (3) all experience is analyzable as essentially social in the Whiteheadian sense of “feeling of feeling”; (4) sensation is essentially “adaptive” in the evolutionary biological sense; and, (5) sensory qualia have a common origin in evolutionary history.  The whole doctrine might be conveniently labeled as the “affective continuum hypothesis.” The third item is central to the thesis of panexperientialism, which Hartshorne defended throughout his career. In view of its importance to his metaphysics, it will require separate discussion. Here the focus will be on a brief exposition of the other mentioned theses.

First, Hartshorne rejects the “classical” doctrine of Hermann von Helmholtz that the various sensory modalities (visual, olfactory, tactile, gustatory, and auditory experiences) are tightly compartmentalized, allowing no degrees of lesser or greater similarity, and no transition from one modality to another. According to the classical doctrine, while degrees of qualitative similarity or analogy might be permissible within a given sensory modality (for example, dark magenta and royal purple are qualitatively “closer” to one another than are, say, candy red and canary yellow), no inter-modal sensory analogies are permissible such that we could intelligibly say that, for instance, certain odors are more or less similar to certain colors. Moreover, the classical theory of sensation held that sensations are not inherently emotional or affective in character; any affective properties found to be associated with sensations are culturally conditioned “additions” to the sensations; in effect, sensations are essentially pure “registrations” of cognitive data. For classical theory, emotions and sensations are entirely separate functions of consciousness. To the contrary, Hartshorne argues that the classical theory does not fit the phenomenological and empirical evidence, is out of touch with the intersensory analogies provided in all manner of ordinary language metaphors, and does not cohere with the concept of an evolutionary history of sensory systems.

While experimentation on intersensory phenomena is a complex affair and interpretation of some results is disputable, it is fair to say that a body of evidence has emerged which bodes well for the thesis of intersensory connection. Indeed, it is now a commonplace of contemporary psychology texts to discuss evidence for intersensory analogies, for instance, the establishment of connections between visual and auditory neural systems as well as evidence of visual-auditory correlations in the cognitive development of infants. It is also particularly telling that neuroscientists have developed sensory substitution systems that can allow the blind to construct images, objects, and words from tactile stimulation. Moreover, Hartshorne points to abundant metaphors of common parlance which make intersensory connections: some colors are said to be “warm” or “loud,” some sounds are said to be “sweet” or “sour,” some affective states or moods are said to be “blue” or “dark,” or some smells are said to be “delicious” or “distasteful.” The practice of employing intersensory metaphors occurs widely across cultures and is broadly communicative or publicly accessible, pointing (at the very least) to the possibility of intersensory continuity and to an underlying objective affect-quality in sensation, thus grounding the communicability of the intersensory metaphors. If the sensory modes are as rigidly separated and analogical connection is as unintelligible as classical theory maintains, it is difficult to explain why language is so saturated with intersensory metaphors. Hartshorne does not deny that there are strong qualitative differences between the qualia of various sensory modes (indeed his theory posits qualitative difference in terms of a geometric notion of “distance on a continuum”), nor does he deny that cultural conditioning can play an important role in constructing affective associations with sensations. Rather, his theory rejects the rigid discontinuity of the sensory modes and the separation of sensation from affectivity.

While Hartshorne is cognizant of cultural conditioning of sensory experience, he argues that such conditioning can be shown to presuppose an underlying affect in the “conditioned” sensation. Consider a locus classicus case of culturally constructed associations of affectivity in classical theory: the preference for white dress in traditional Chinese funerals as opposed to black or dark dress in traditional Spanish or Italian funerals is said to show that there are fundamentally different emotional qualities attached to white in Chinese as opposed to European cultures. Hartshorne argues that this misconstrues the situation. The cultural difference is found in different attitudes toward death and funeral rites, not in different feelings concerning the colors white or black (the Chinese think of funerals as positive celebrations of past life). Hartshorne also applies this reasoning to variations in individual sensory-qualitative preferences. In Creative Synthesis and Philosophic Method, he remarks that the fact that some persons prefer a certain bitter quality of strong dark chocolate does not show that such individuals “fail to sense the contrast, sweet-bitter, as essentially positive-negative.” It means rather that they do not want mere sweetness or pleasantness; they want a more complex sensory experience. Hartshorne’s point is that an adequate phenomenology of sensation must include the appropriate “layered” complexity of sensory experience and thus accommodate the fact that we have meta-feelings (“feelings about our sensory feelings”) in addition to “object-feelings” (feelings about things that are not feelings, like chocolate). It is the duality of this, so to speak, “meta-feeling/object-feeling” situation which is the source of the distinction (which Hartshorne calls a “pseudo-duality”) between affect and mere sensation posited by the classical theory of sensation.

In addition, it is not clear how the classical view can be squared with the evolutionary development of sensory modes. If the sensory modes are as separate as classical theory supposes, then how could new sensory modes which evolve have meaningful connections to older modes? Were the transitions from one mode to another simply de novo additions abruptly occurring all-at-once, contrary to standard neo-Darwinian assumptions of gradualism? If one sensory mode evolved from another, then how could it be impossible for the new sensory mode to have analogical connections with its modal parent? How could information from the different sensory modes be coordinated during early moments of evolutionary transition if there is no meaningful analogical connection between them? Would not an organism that possessed the capacity to integrate information from different sensory modes be better adapted to its environment? Hartshorne’s theory, on the other hand, supposes that sensory modes are intrinsically connected by their common evolutionary origins (with tactile capacities as the earliest), that sensation is a form of affectivity that serves the purpose of enhancing the prospects of an organism’s survival, and that this underlying physiological connection of sensation and affectivity is what is primal—it is the “object-feeling” pole of the “meta-feeling/object-feeling” duality found in our complex emotional life.

The affective properties of sensation are most immediately evident in the case of pain; indeed, intense sensations of pain are ineluctably described in strongly affective terms such as “horrific” or “torturous” or “excruciating.” While there may be cases in which, paradoxically, pain is experienced as pleasure, such cases by definition posit a hedonic property to the experience inimical to the notion of a thoroughly “disinterested” pain. The affective aversion that is part and parcel of the experience of pain also clearly coheres with the biological or adaptive value of affectivity that Hartshorne’s theory asserts. Organisms that are not warned of injury by virtue of pain, and that do not seek to avoid such injury by virtue of visceral, emotional aversion to pain, are to that extent vulnerable to their environments. Other tactile qualia such as sensual touch are obviously inseparable from hedonic content. Gustatory qualia are also affective, as enjoyment of delicious foods and strong aversion to extremely sour or spoiled foods attest. Newborn infants react with aversion to sour, bitter, or fetid substances, and so it is difficult to “argue away” gustatory affect as culturally conditioned. Here again, there are obvious biological or adaptive advantages for organisms capable of being affectively reinforced by and motivated to seek nutritious foods and avoid fetid substances or spoilage. Sounds, especially in the form of music, are readily seen to evoke emotions in immediate ways. Minor chords, for instance, have an immediate “sad” or “melancholy” tonality which explains their use in ballads evocative of such moods.

Hartshorne understood that the more difficult case for his theory is visual phenomena. For this reason, he discusses at length the affective nature of visual experience with a particular emphasis on color sensation. Careful attention to our experience of color reveals that strong primary colors exhibit affective qualities, as in the paradigm cases that “gaiety” is part and parcel of yellow and “warmth” of red. While Hartshorne admits that there seem to be dull color sensations to which we may seem affectively indifferent, that such sensations possess some slight degree of affect could be shown by imagining blindness with respect to such colors; in addition, such colors have a valuable contextual role to play in providing certain nuances of contrast. In his treatment of Hartshorne’s theory in the Library of Living Philosophers (LLP), psychologist Wayne Viney notes that some previously blind persons who are successfully re-sighted attach much significance even to the visually trivial. Importantly, Hartshorne argues that without such an affective account of color, it is extremely difficult to give a coherent account of the visual arts. If affective qualia are always merely accidentally “associated” with color by virtue of idiosyncrasies of personal experience, how could artists communicate or express intelligibly? For instance, the dulled grayish-brownish tones of an Edward Hopper painting convey the depressive atmosphere of life during the Great Depression far better than would the alternative use of bright yellows or Kelly greens or Titian reds. Indeed, certain projects of modern art, such as those found in the work of Kandinsky, depend on the notion that color expression can in and by itself evoke emotion without mediation through well-defined objects, whether in surreal juxtaposition or otherwise.

Adaptive values for color sensation are not difficult to conceive. The greater discriminatory information provided by color sensation at least enhances, say, human abilities to demarcate and map out their immediate environments. Moreover, at least one affective property of color can be correlated with experimental neuro-physical evidence; the inherent “aggressiveness” of red correlates with the empirically discerned increase in cortical stimulation upon exposure to red as compared with exposure to blue. While this may be explained by cultural conditioning (for example, our learned response to red stop signs), such an explanation may also beg the question as to why red is so often selected as a color of warning. On Hartshorne’s theory, the selection of red occurs precisely because it has the stimulating or aggressive affect it does. In general, Hartshorne sides with Julius Pikler in connecting all affectivity of sensation at its most fundamental level with excitations to act or with behavioral avoidances, and these in turn have an evolutionary “cash value” or utility. Nonetheless, empirical study of the affectivity of color sensation is by no means settled, and results remain unclear, in part because it is difficult to separate learned from universal emotional responses to color. Hartshorne’s theory, however, points in the direction of an overall evolutionary account of sensation. Even if Hartshorne mishandled some of the details, the general thesis of color affect brings color vision in line with other sense modalities and best explains why it was strongly “naturally selected.”

3. Conclusion: Hartshorne’s Work on Sensation and the Rest of his Philosophy

Hartshorne’s first book could be seen, in one respect, as a systematic attack against the form of materialism that finds inspiration in the theory of sense data. From the time of John Locke and David Hume, some empirically minded philosophers and psychologists analyzed experience in terms of “sensory impressions.” Emotions were conceived as annexed onto bare impressions; Hartshorne characterizes this as “the annex view of value.” As already noted, this view of emotion is at odds with evolutionary thinking since a sensation-minus-affect would be lacking in adaptive value. Equally, it is not clearly a deliverance of experience. The analysis of experience into sensory impressions is, Hartshorne held, bad phenomenology; it is an intellectualized reconstruction of experience. The mistake was, in part, due to the excessive attention paid to visual experience, which, as we have noted, is where affect is least apparent. Visual experience exhibits less felt relevance of the body than one finds in the other sensory modalities. This may account for the prevalence of visual metaphors for a supposedly immaterial process of intellection. It is easier to forget that one sees with the eyes than it is to forget, for example, that one touches with the skin.

In light of Hartshorne’s conviction concerning the data of experience, it is not difficult to understand why he resonated to the expression “feeling of feeling,” an idea (if not the exact wording) that he found in Chapter X (section II) of Whitehead’s Process and Reality. The clearest instance of a feeling of feeling, for both Whitehead and Hartshorne, is memory, for it is at a minimum the record of a past experience in a present experience. The example of memory also supports Hartshorne’s contention that, while every sensation is a feeling, not every feeling is a sensation. Hartshorne would later refer to the difference between introspection and perception as the difference between personal and impersonal memory.

When Hartshorne came to the business of ontology, he could find nothing more consonant with his psychology of sensation, nothing more in keeping with evolutionary thinking, and nothing more coherent philosophically than panexperientialism, the view that the basic constituents of reality are momentary flashes of experience. Whitehead called these “actual entities” or “actual occasions”; Hartshorne sometimes called them dynamic singulars.  Panexperientialism implies that there must be non-human and non-conscious forms of experience. Leibniz had argued this case before evolutionary theory, but evolution made the case even more convincing. Humans are different from the creatures from which they evolved by matters of degree. Mind-like qualities, Hartshorne argued, are susceptible to an infinitely flexible number of forms. Hartshorne and Whitehead held that every concrete particular is an experient occasion; they did not, however, believe that every whole made of such occasions can be said, as a whole, to feel the world. Whitehead spoke of a tree as a democracy, the cells making up its members—there can be cellular feelings even if the tree as a whole does not feel. Hartshorne used the analogy of a flock of birds: there are feelings in each bird, but the flock itself does not feel.

If Hartshorne followed Whitehead on the ontology of actual occasions, he parted ways with him on how best to construe the nature of possibility. Whitehead took possibility to be grounded in an array of eternal objects, including particular sensory qualities, constituting an ideal world. As is evident in his first book, Hartshorne preferred to think of sensory qualities as existing along an affective continuum. Whitehead, it seems, was not dogmatic in rejecting this view. Hartshorne reports that he presented Whitehead with the following reasoning: if points are constructed from the extensive continuum and not vice versa, as Whitehead held, perhaps, by parity of reasoning, particular sensory qualities are extracted from an affective continuum and not vice versa. According to Hartshorne, Whitehead called the argument “subtle” requiring “further reflection.” It is also worth remarking that Hartshorne’s view is more radically processive than Whitehead’s since it implies that sensory qualities are emergent as the affective continuum is sliced in various ways through the evolutionary process within and between species.

Hartshorne’s theory of the affective continuum is very much in keeping with his aesthetics and with his theory of a monotony threshold in song birds. Hartshorne’s aesthetics locates beauty—which could also be called intense satisfactory experience—as a mean between two pairs of extremes: absolute order and absolute disorder, ultra-complexity and ultra-triviality. Aesthetic experience, like all sensory experience, must have, on Hartshorne’s account, both a subjective and an objective side. In a word, Hartshorne denies that the quality of beauty is “merely in the eye of the beholder,” or to generalize, “merely in the perception of the perceiver.” Hartshorne’s study of bird song convinced him that oscines have a primitive aesthetic sense. He found evidence that birds with more varied repertoires have shorter pauses between their songs than do birds with less varied repertoires. In a word, simpler repertoires induce more boredom whereas varied repertoires are more interesting—hence, a “monotony threshold.” Hartshorne meant his theory to supplement, not to replace, standard accounts of bird song as the marking of territory. His view of the aesthetics of bird song coheres nicely with his evolutionary view of sensation and affective tone.

Hartshorne’s emphases on the primacy of feeling in perception and of aesthetic experience are also evident in his form of theism. God, he held, has the eminent form of “feeling of the feelings” of others. In the first instance, this means that God’s knowledge is suffused with affect and is not simply an intellectual awareness of the world, for example, a knowing of the truth value of propositions. According to Hartshorne, divine cognition is a form of what William James called “knowledge of acquaintance” rather than simply a “knowledge-about.” This idea yields a view of omniscience that is decidedly more intimate than one that is couched in terms of the metaphor of an “all-seeing” deity. Since, for Hartshorne, the relation of “feeling of feelings” has a temporal structure, every instance of awareness in the present must be nothing other than an awareness of the past. It stands to reason that, if God is the eminent embodiment of “feeling of feelings,” God must also have the eminent form of memory. This is indeed Hartshorne’s view, which he calls “contributionism.” Every experience of a non-divine being is felt and retained in perfect memory by God, thereby contributing to the richness of the divine immortal life. In Hartshorne’s words, God’s possession of us, not our possession of God, is our final achievement.

4. References and Further Reading

a. Primary Sources

i. Life

  • Hartshorne, Charles. 1970. “The Development of My Philosophy.” Contemporary American Philosophy: Second Series, ed. John E. Smith. London: Allen & Unwin: 211-28.
  • Hartshorne, Charles. 1970. “Charles Hartshorne’s Recollections of Editing the Peirce Papers.” Transactions of the Charles S. Peirce Society 6, 3-4: 149-59.
  • Hartshorne, Charles. 1973. “Pensées sur ma vie”: 26-32; “Thoughts on my Life”: 60-66. Bilingual Journal, Lecomte du Noüy Association, 5 (Fall).
  • Hartshorne, Charles. 1984. “How I Got that Way.” Existence and Actuality: Conversations with Charles Hartshorne. John B. Cobb, Jr. and Franklin I. Gamwell, eds. Chicago: University of Chicago Press: ix-xvii.
  • Hartshorne, Charles. 1990. The Darkness and the Light: A Philosopher Reflects Upon His Fortunate Career and Those Who Made it Possible. Albany: State University of New York Press.
  • Hartshorne, Charles. 1991. “Some Causes of My Intellectual Growth.” The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. Lewis Edwin Hahn, ed. La Salle, Illinois: Open Court: 3-45.

ii. Psychology of Sensation

  • Hartshorne, Charles. 1927. Review of A. N. Whitehead, Symbolism, Its Meaning and Effect (New York: Macmillan, 1927). Hound and Horn 1: 148-52.
  • Hartshorne, Charles. 1931. “Sense Quality and Feeling Tone.” Proceedings of the Seventh International Congress of Philosophy. Gilbert Ryle, ed. London: Oxford UP: 168-72.
  • Hartshorne, Charles. 1934. The Philosophy and Psychology of Sensation. Chicago: University of Chicago Press. Republished in 1968 by Kennikat Press.
  • Hartshorne, Charles. 1934. “The Intelligibility of Sensations.” The Monist 44, 2: 161-85.
  • Hartshorne, Charles. 1961. “Professor Hall on Perception.” Philosophy and Phenomenological Research 21, 4: 563-71.
  • Hartshorne, Charles. 1963. “Sensation in Psychology and Philosophy.” Southern Journal of Philosophy 1, 2: 3-14.
  • Hartshorne, Charles. 1965. “The Social Theory of Feelings.” Southern Journal of Philosophy 3, 2: 87-93. Reprinted in Persons, Privacy, and Feeling: Essays in the Philosophy of Mind, ed. Dwight Van de Vate, Jr. Memphis: Memphis State UP, 1970: 39-51.
  • Hartshorne, Charles. 1967. “Psychology and the Unity of Knowledge.” Southern Journal of Philosophy 5, 2: 81-90.
  • Hartshorne, Charles. 1970. Creative Synthesis and Philosophic Method. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1973. Born to Sing: An Interpretation and World Survey of Bird Song. Bloomington, Indiana: Indiana University Press.
  • Hartshorne, Charles. 1984. “Response to George Wolf.” Existence and Actuality: Conversations with Charles Hartshorne. John B. Cobb, Jr. and Franklin I. Gamwell, eds. Chicago: University of Chicago Press: 184-188.
  • Hartshorne, Charles. 2001. Notes on A. N. Whitehead’s Harvard Lectures 1925-26, transcribed by Roland Faber. Process Studies 30/2: 301-373.

b. Secondary Sources

i. Life

  • Peters, Eugene H. 1970. Hartshorne and Neoclassical Metaphysics. Lincoln: University of Nebraska Press: 1-14.
  • Viney, Donald Wayne. 2003. “Charles Hartshorne.” American Philosophers Before 1950. In Dictionary of Literary Biography, volume 270, edited by Philip B. Dematteis and Leemon B. McHenry. Detroit: Thomson Gale: 129-51.
  • Viney, Donald Wayne. 2004. “Charles Hartshorne.” Dictionary of Unitarian Universalist Biography, 1999-2004. On-line at: http://www.uua.org/uuhs/duub/articles/charleshartshorne.html
  • Viney, Donald Wayne. 2005. “Hartshorne, Charles (1897-2000)” The Dictionary of Modern American Philosophers, edited by John R. Shook (London: Thoemmes Press): 1056-62.
  • Viney, Donald Wayne. 2008. “Charles Hartshorne (1897-2000),” Handbook of Whiteheadian Process Thought, Volume 2, edited by Michel Weber and Will Desmond. (Frankfurt / Paris / Lancaster: Ontos Verlag): 589-596.

ii. Psychology of Sensation

  • Anon. 1985. Report on Hartshorne’s “My Enthusiastic but Partial Agreement with Whitehead,” presented at the eleventh Congreso Interamericano de Filosofía, Guadalajara, Mexico, Nov. 15, 1985. Center for Process Studies Newsletter 9, 4: 7.
  • Dombrowski, Daniel. 2004. Divine Beauty: The Aesthetics of Charles Hartshorne. Nashville, Tennessee: Vanderbilt University Press.
  • Hospers, John. 1991. “Hartshorne’s Aesthetics.” The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. Lewis Edwin Hahn, ed. La Salle, Illinois: Open Court: 113-134.
  • Viney, Wayne. 1991. “Charles Hartshorne’s Philosophy and Psychology of Sensation.” The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. Lewis Edwin Hahn, ed. La Salle, Illinois: Open Court: 91-112.

c. Bibliography

“Primary Bibliography of Philosophical Works of Charles Hartshorne” (compiled by Dorothy Hartshorne; corrected, revised, and updated by Donald Wayne Viney and Randy Ramal) in Herbert F. Vetter, editor, Hartshorne: A New World View: Essays by Charles Hartshorne (Cambridge, Massachusetts: Harvard Square Library, 2007): 129-160. Also published in Santiago Sia, Religion, Reason and God (Frankfurt am Main: Peter Lang, 2004): 195-223.

 

Author Information

Donald Wayne Viney
Email: don_viney@yahoo.com
Pittsburg State University
U. S. A.

and

George W. Shields
Email: George.shields@kysu.edu
Kentucky State University
U. S. A.

Charles Hartshorne: Theistic and Anti-Theistic Arguments

Charles Hartshorne is well known in philosophical circles for his rehabilitation of Anselm’s ontological argument. Indeed, he may have written more on that subject than any other philosopher. He considered it to be the argument that, more than any other, reveals the logical status of theism. Nevertheless, he always clearly and explicitly denied that the argument was his reason for being a theist. There are two reasons for this. First, he believed that, without a revision in the very concept of deity, Anselm’s argument could readily be turned upside down, so to speak, so as to constitute not a proof of theism but its disproof. Consequently, Hartshorne believed that a full defense of theism requires developing a coherent concept of God. (See “Charles Hartshorne: Dipolar Theism.”) Second, Hartshorne’s revised ontological argument does not stand alone. It is one strand in a fabric of reasoning which he sometimes called “the global argument.” He followed C. S. Peirce’s recommendation that philosophy should rely on a variety of interrelated pieces of evidence rather than trust to the conclusiveness of a single argument. Peirce (5.265) used the analogy of a cable, the strength of which is in the combination of its numerous fibers. Peirce specifically mentioned that this way of arguing is typical of science, but it is also evident in other areas such as law, history, and literary criticism. Nowadays, philosophers use Basil Mitchell’s terminology and call the multiple argument strategy a “cumulative case.” Hartshorne’s most systematic presentation of the global argument is in the fourteenth chapter of Creative Synthesis and Philosophic Method, titled “Six Theistic Proofs.” Not long after this essay appeared, he stopped calling the arguments proofs, for he recognized that it is often the case that equally rational and informed philosophers disagree on fundamental issues. For this reason, he presented the global argument in a way that emphasizes both the rational basis of neoclassical theism and the rational cost of rejecting it. In addition to discussing Hartshorne’s case for theism, this article also addresses Hartshorne’s reflections on the problem of evil.

Table of Contents

  1. Anselm’s Discovery and the Ontological Argument
  2. The Global Argument
  3. The Problem of Evil and Theodicy
  4. Conclusion
  5. References and Further Reading
    1. Primary Sources
      1. Books in Order of Publication Date
      2. Hartshorne’s Response to his Critics
      3. Selected Articles
    2. Secondary Sources
    3. Bibliography

1. Anselm’s Discovery and the Ontological Argument

It used to be customary to speak in the singular of “Anselm’s ontological argument.” Hartshorne was the first to argue that this is mistaken. Setting aside the question of Anselm’s intentions, Hartshorne found that two arguments are suggested in Anselm’s Proslogion, one in chapter II, another in chapter III. Hartshorne made this point in 1944 in an article published in The Philosophical Review and again in 1953 in Philosophers Speak of God. The philosophical world did not take notice until 1960 when Norman Malcolm’s article, “Anselm’s Ontological Arguments,” made the distinction between the two arguments famous. Hartshorne, like Malcolm, agreed with Anselm’s critics that the first argument (in chapter II) is fallacious, but the second argument (in chapter III), which has a modal structure, he considered valid. The difficulty in showing that the argument is sound kept Hartshorne from thinking of it as demonstrating God’s existence. In The Logic of Perfection, Hartshorne presented a formalized version of the argument using C. I. Lewis’s system S5, the first such formalization to be published. In Anselm’s Discovery, he again defended a version of the argument and canvassed the various treatments of Anselm’s reasoning in the history of philosophy, including an anticipation of the argument in Plato noted by the scholar Prescott Johnson.

In the introduction to George L. Goodwin’s The Ontological Argument of Charles Hartshorne, and again in Creative Experiencing, Hartshorne reduced the modal ontological argument to what he considered to be its essentials. The argument’s logical symbols are the tilde (~) for negation, the arrow (→) for strict implication, M for “is logically possible” (thus, “~M~” means “is logically necessary”), and p* for “God exists,” where God is defined as “a being unsurpassable by any other conceivable being.” (In Hartshorne’s dipolar theism, the divine can, in some senses, surpass itself, but it is unsurpassable by any other being.) The argument is presented as follows:

  1. Mp*
  2. Mp* → ~M~p*
  3. Therefore, ~M~p*

If necessity (~M~) is what is common to all possibilities—a common definition—and if any state of affairs that is actual is also possible—a standard modal principle—then the conclusion to be drawn is that God exists (p*). Hartshorne was under no illusions that this mode of reasoning would convince the skeptic that God exists. Nor did he use it as his reason for believing in God. Nevertheless, the argument is not, in the hyperbole of Graham Oppy (199), “completely worthless.” In A Natural Theology for Our Time, Hartshorne credited George Mavrodes with the insight that, from the fact that an argument cannot remove every doubt about theism, it does not follow that it can remove none. Moreover, the simple deductive structure of the argument clarifies what is at stake in the theistic question. If one denies the conclusion, one must deny one or more of the premises or what their denials entail. Hartshorne follows Gottfried Wilhelm Leibniz in urging that, in questions of metaphysics, philosophers are more apt to err in what they deny than in what they affirm. Highlighting the rational cost of rejecting theism can, for this reason, be a fruitful method in metaphysics.
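
To make the chain of inferences explicit, here is a minimal sketch in LaTeX of the deduction just described, writing L as shorthand for ~M~. The step labels and annotations are ours rather than Hartshorne's, and the fourth line merely records the principle, noted above, that what holds in all possibilities holds in the actual world:

\begin{align*}
&1.\quad M p^{*} && \text{premise: God's existence is logically possible}\\
&2.\quad M p^{*} \rightarrow L p^{*} && \text{premise: Anselm's principle, with } L \text{ abbreviating } {\sim}M{\sim}\\
&3.\quad L p^{*} && \text{from 1 and 2 by modus ponens}\\
&4.\quad L p^{*} \rightarrow p^{*} && \text{what holds in every possible state of affairs holds in the actual one}\\
&5.\quad p^{*} && \text{from 3 and 4 by modus ponens: God exists}
\end{align*}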

If one rejects the conclusion of Hartshorne’s modal argument, one of two alternatives is possible. First, it may be that God’s existence is impossible (~Mp*), which is the denial of the first premise. This is the view that J. N. Findlay originally took in his famous 1948 article, “Can God’s Existence Be Disproved?” In effect, Findlay’s argument turns Hartshorne’s modal modus ponens upside down to make a modal modus tollens disproof: If M~p* and Mp* → ~M~p*, it follows that ~Mp*. Hartshorne referred to this as the a priori atheist or positivist position. The second alternative is that a logical consequence of the second premise is false. The strict implication of the second premise allows one to infer that if God’s existence is logically possible then it is logically necessary. If this is false, then God’s existence and non-existence are equally possible: Mp* and M~p*. This was the view of David Hume, for whom every proposition asserting or denying existence, including “God exists,” is logically contingent. Hartshorne calls this the empiricist position, or sometimes empirical theism or empirical atheism depending on whether or not the empiricist thinks that God exists.
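
Displayed in the manner of the argument above, Findlay’s disproof, as reconstructed here, runs:

  1. M~p*
  2. Mp* → ~M~p*
  3. Therefore, ~Mp*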

Hartshorne considered the empiricist position regarding the ontological argument the least tenable. The second premise says, colloquially, that if God is so much as logically possible, then it must be the case that God exists. Hartshorne calls this “Anselm’s principle,” or more forcefully, “Anselm’s discovery.” The discovery is that God, as unsurpassable, cannot exist with the possibility of not existing. Put differently, contingency of existence is incompatible with deity. Anselm’s formula that God is “that than which nothing greater can be conceived” means, among other things, that any abstract characteristic for which something greater can be conceived cannot properly be attributed to deity. For example, if there is something greater than being partially ignorant, then God cannot be conceived as partially ignorant. Or again, if there is something greater than interacting with some but not all others, then God cannot be conceived as a merely localized being. Applied to modality of existence, Anselm’s principle means that a deity that can fail to exist is not the greatest conceivable. If this is correct, it is a mistake to conceive of God as possibly existing and possibly not existing. This is another way to state the second premise. One may deduce from this premise that it is impossible that God’s existence and non-existence are both logically possible. In symbolic notation: ~M(Mp* and M~p*).
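
A sketch of the deduction, in the article’s notation, treating the arrow as strict implication so that Anselm’s principle holds in every possibility (a reconstruction, not Hartshorne’s own wording):

  1. Suppose Mp* and M~p*
  2. Mp* → ~M~p* (Anselm’s principle)
  3. From 1 and 2, ~M~p*, which contradicts the M~p* of line 1
  4. So the supposition fails in every possibility in which Anselm’s principle holds, that is, in every possibility
  5. Therefore, ~M(Mp* and M~p*)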

Hartshorne emphasized that the empiricist view he considers Anselm to have refuted is shared by, among others, those who treat the existence of God as a hypothesis to be established or refuted by science. Hartshorne accepts Karl Popper’s idea that empirical statements must be falsifiable by some conceivable experience (see “Charles Hartshorne: Neoclassical Metaphysics”). Anselm’s principle entails that if God exists, there could be no disconfirming empirical evidence of God’s existence. On the other hand, if God does not exist, then by parity of reasoning, there could be no confirming empirical evidence of God’s existence. If premise two is correct, the remaining options are that God exists necessarily (~M~p*) or God’s existence is impossible (~Mp*). This removes the question of God’s existence from the domain of science. Yet, this is not the same as removing the question from rational justification, unless metaphysics is impossible, a position that Hartshorne vigorously opposed. In effect, treating the existence of God as a scientific hypothesis is a failure to conceive of God as unsurpassable by any being other than God—and is therefore a changing of the subject.

Among the many criticisms of Hartshorne’s reasoning about the ontological argument, four stand out as deserving special treatment: one from J. N. Findlay, one from John Hick, one stemming from W. V. O. Quine’s reflections on modal logic, and one from H. G. Hubbeling. Each is set forth in The Philosophy of Charles Hartshorne. Hartshorne praises Findlay for most clearly stating the objection that the concrete cannot be deduced from the abstract, and that this is what the ontological argument purports to do. Definitions are abstract, but God’s existence must be concrete; from the logically weak definition of God one may not deduce the logically stronger conclusion that God exists. Put somewhat differently, if the deduction succeeds, then God’s existence must be as abstract as God’s essence. Hartshorne’s response to Findlay is to accept the principle but to appeal to the distinction between existence and “actuality”—Hartshorne’s term for existence in a particular, determinate, concrete state (see “Charles Hartshorne: Dipolar Theism, section 2”). To be sure, the ontological argument concludes to the existence of God, which is abstract, but more explicitly, it concludes to God’s existence as somehow actualized. No actual state of God—which is the concreteness of God—can be deduced by a metaphysical argument. The structure of this reasoning is analogous to Hartshorne’s argument that non-being is impossible (see “Charles Hartshorne: Neoclassical Metaphysics”). The statement “Something exists” may be necessarily true, as Hartshorne urges, although it gives no information as to what actually exists. It only says that the set of existing things is not empty. By parity of reasoning, the conclusion of Hartshorne’s modal argument can be rephrased to say that the set of actual divine states is never empty. With good reason, Hartshorne insisted that he knew very little about God. At most, his metaphysics yields only the most abstract truths about deity, although he stressed that it is a notable achievement to advance the subject of metaphysics when so few attend to its reasoning.

Hick pressed the objection that Hartshorne’s ontological argument confuses two kinds of necessity: one pertaining to propositions (logical necessity), the other pertaining to a being (ontological necessity). According to Hick, to say that God exists of necessity is to say no more than that God has the property of “aseity.” That is, God’s existence, unlike all creaturely existence, depends upon nothing outside of itself. This does not mean, Hick claims, that “God exists” is a necessary truth. To speak of God’s existence as logically necessary is, in Hick’s view, a category mistake: applying to a being a predicate that is properly a predicate of sentences. Hartshorne agrees with Hick that, excluding the case of God, all propositions asserting the existence or non-existence of an individual are logically contingent. However, in all of these cases, there is a causal explanation for the possibility of the individual’s existence, which neatly explains why the proposition asserting or denying existence is not necessarily true. For example, the non-existence of x’s monozygotic twin is explained by the fact that the fertilized egg from which x came did not split; x’s existence also has a causal explanation in the union of a particular sperm and egg. Hartshorne notes that there is no analogous explanation, on Hick’s empiricist account, for why “God exists” is logically contingent. Yet, Hartshorne has a ready explanation for why the proposition is not logically contingent, an explanation, moreover, that Hick uses in explaining the meaning of divine necessity: neither God’s existence nor non-existence could have a causal explanation. In both The Logic of Perfection and Creative Experiencing, Hartshorne discusses other characteristics of logically contingent propositions that “God exists” lacks. For example, God’s existence includes all positive forms of existence, whereas the existence of any creature within the universe excludes certain positive states of affairs. Hartshorne says that God’s existence is not competitive. Hartshorne’s conclusion is that, on Hick’s account, “God exists” violates the usual semantic criteria for a proposition to count as logically contingent.

Hartshorne’s response to Hick is that the meanings of modal terms must be anchored in the causal-temporal matrix. If this is true, then only particular noun-adjective combinations are logically conceivable. Numerous parodies of the modal argument—beginning with Gaunilo’s “perfect island”—consist in joining the concept of necessary existence to real or imagined localized beings. On Hartshorne’s account these ideas are improperly conceived, for they cannot withstand the application of semantic criteria that distinguish contingent and necessary truths. Attaching necessary existence to a being that is properly conceived as contingent is the reverse of the error of attaching contingent existence to a being that is properly conceived as necessary. Hartshorne counts both extremes as errors. It is no accident that it was J. S. Mill, an empiricist, who made famous the question, “Who made God?” If “God” signifies a being unsurpassable by all others, then asking for the cause of God’s existence is on a par with asking what is north of the North Pole. Both questions are grammatical, but both are also nonsensical. Of course, on Hick’s account of divine aseity, it is also a mistake to ask for the cause of God’s existence. However, Hartshorne’s theory of the semantic grounding of modal terms in temporal process provides one reason why it is a mistake.

Another important objection to Hartshorne’s modal ontological argument, especially as presented in The Logic of Perfection, arises from Quine’s attack on the intelligibility of de re modality. While Hick criticized Hartshorne’s modal argument for moving illicitly from de dicto (linguistic) to de re (ontological) conceptions of modality, Quine’s strategy is to reject the very intelligibility of de re modality. If successful, such a critique would surely devastate the modal version of the argument since, for Hartshorne, “logical modality mirrors objective modality.”

Quine’s challenge to the intelligibility of de re modality has been taken up in great detail by Goodwin in his book The Ontological Argument of Charles Hartshorne. In his foreword to the work, Hartshorne endorses Goodwin’s approach. The argument can be summarized as follows. Quine objected to the idea of de re modality, since it involves quantification into modal contexts. For example, the formulation “(∃x) (necessarily, x is greater than seven)” is logically illicit, claims Quine, because the modal operator “necessarily” is inserted within a quantifier-bound variable-predicate expression. Quine points out that we cannot generalize existentially from the legitimate de dicto formulation:

(a) Necessarily, nine is greater than seven.

to the illicit

(b) (∃x) (necessarily, x is greater than seven).

This is because “nine” in (a) is referentially opaque; it fails to denote in a singular way, and thus opens the door to counter-examples in the generalized sentence (b). For instance, Quine says that “nine” can name “the number of planets,” but it is not a property of “the number of planets” that it is necessarily greater than seven. Given his theory of contingent states of affairs, Hartshorne would not object to the notion that “the number of planets,” presumably in our solar system, is indeed a contingency. The thrust of this is that, because of referential opacity in quantified modal logic, we do not know what it means to introduce propositions of the existentially generalized form (b). However, Goodwin notes that Hartshorne is indeed committed in his modal version of the argument to such forms as:

(∃x) (necessarily, x is perfect).

Consequently, an effective Hartshornean response to Quine’s critique requires an intelligible semantics for modal logic.

Goodwin argues that Saul Kripke supplies such a semantics in the essay “Semantical Considerations on Modal Logic.” According to Kripke, we can give an intelligible account of sentences involving quantification into modal contexts. A sentence having the form of (b) can be interpreted to say: “there is an object, x, in this world which has the property ‘greater than seven,’ and x has this property in every possible world in which x exists.” In other words, x exists in this world and in at least some possible worlds accessible from this world, and x falls under the extension of the predicate “greater than seven” in every world in which it exists. However, this only takes one so far in the provision of an (arguably) intelligible formal semantics for sentences involving quantification into modal contexts.
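
The truth condition Goodwin draws from Kripke can be put schematically as follows; this is a gloss on the interpretation just quoted, read with variable domains, rather than a citation of Kripke’s own formulation:

  “(∃x)(necessarily, Fx)” is true at a world w just in case there is some individual d in
  the domain of w such that, for every world w′ accessible from w in whose domain d occurs,
  d falls under the extension of F at w′.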

Quine replies that the very terms of this formal semantical solution to the problem of opacity raise the further question of what it means for an individual or object to exist in various possible worlds. This problem has come to be known as the problem of “trans-world identity.” Quine challenges any response to his critique of de re modality based on Kripke’s semantics by arguing that Kripke’s solution to referential opacity ushers in a semantics involving the difficulty of “essential properties.” For instance, let the value for x be C. S. Peirce, while the predicate attributed to x is “being a speculative philosopher.” Must Peirce be a speculative philosopher in any possible world in which he exists in order to be Peirce in such worlds? Could Peirce be a “seventeenth century sea captain” in some possible worlds and still intelligibly remain Peirce in such possible worlds?

It is precisely here, argues Goodwin, that Hartshorne’s ontology of temporal process can be employed, providing Kripke with intelligible criteria for making trans-world identifications. The problem of trans-world identity seems perplexing and insoluble when assuming, to use Quine’s phrase, “Aristotelian essentialism,” in which essential properties belong to substances that make no inherent reference to temporality. By contrast, Hartshorne’s process or event ontology positions the search for an intelligible criterion for trans-world identity in the much wider matrix of successive and causally efficacious temporal units of becoming. This is one reason why Hartshorne prefers to speak of “possible world-states” rather than “possible worlds” (see “Charles Hartshorne: Neoclassical Metaphysics”). Temporal inheritance becomes the essential factor in determining identity, and thus more readily settles the above questions: Peirce might well exist as, say, a professional painter in some possible world-state, since he might have been one in the history of this actual world; that is, since there may have been a juncture in Peirce’s development in which he was not particularly taken with questions of speculative philosophy, but was exposed to an environment of intense interest in artistic expression. Yet, surely he could not be, in any possible world-state, a seventeenth century sea captain, since this would have nothing in common with his succession of temporal events. To conclude the issue cautiously, perhaps we should say that, even if Hartshorne’s event-ontological criterion of temporal inheritance does not fully resolve the issue of trans-world identity, it seems to simplify it profoundly. More pointedly, this criterion directly answers Quine’s charge that solutions based on Aristotelian essentialism, with its appeal to temporally de-contextualized substances, are unintelligible.

A technically sophisticated objection to Hartshorne’s modal argument, especially as expressed in The Logic of Perfection, comes from H. G. Hubbeling. He presents Hartshorne with a dilemma: the modal argument is valid if and only if the theory of temporal modalities is false. The problem is that Hartshorne’s argument is expressed in Lewis’s S5 system, in which modal status is itself necessary. Symbolically (where L = ~M~): “If Lp* then LLp*” and “If Mp* then LMp*.” Temporal modalities, however, are best expressed in Lewis’s weaker S4 system, which includes the first of these formulae as an axiom but in which the second is neither an axiom nor a theorem. Without “If Mp* then LMp*,” Hartshorne’s argument is not valid, for then it could be the case that God’s existence is possible but not necessarily so. On the other hand, Hartshorne wants to ground the meaning of modal terms in temporal process. The most plausible semantics for S5, however, leaves modal concepts untethered to time.
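
On the standard possible-worlds reading of the two formulae, the contrast Hubbeling relies on can be sketched as follows (a textbook illustration, not Hubbeling’s own argument):

  Suppose Mp is true at a world w; then p is true at some world u accessible from w. Take
  any world v accessible from w. In S5, accessibility is an equivalence relation, so v has
  access back to w and thence to u; hence Mp is true at v, and since v was arbitrary, LMp
  is true at w. In S4, accessibility need only be reflexive and transitive, so a world
  accessible from w may lack access to u, Mp can fail there, and “If Mp then LMp” is not
  a theorem.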

It is to be noted, however, that Hartshorne gave other versions, both informal and formal (such as the version used above), which do not depend on S5. Hartshorne was convinced that an element of intuitive judgment that goes beyond the logical formalism is involved in assessing the argument. However, granting the element of intuitive judgment does not directly answer Hubbeling’s dilemma. What remains is whether Hubbeling’s challenge can be met from within Hartshorne’s form of dipolar theism. It seems true that S5 is the appropriate modal system for expressing the abstract point of the argument relating to God’s unique characteristic of existence in every possible state of affairs. S5’s property of complete “world accessibility” symmetry is exactly what is needed. On the other hand, S4 is applicable to the description of what Hartshorne calls God’s actuality or God’s concrete states. So, Hartshorne’s distinction between existence and actuality maps onto the S5/S4 distinction. (For more on the existence/actuality distinction, see “Charles Hartshorne: Dipolar Theism,” section 2).

2. The Global Argument

If Hartshorne is correct, the ontological argument reveals the logical status of the theistic question as metaphysical rather than empirical. The argument falls short of a proof of theism, in large measure, because it depends on the premise that the existence of God is logically possible. Hartshorne’s own arguments against classical theism show that this premise cannot simply be taken for granted, for on the classical conception of God he held it to be false. Hartshorne once commented that John Duns Scotus also concluded that the question of God’s existence is not empirical. Hartshorne added, “My quarrel with him is that I regard his form of theism as either self-inconsistent or meaningless” (Viney 1985, x). Hartshorne believed that the weak premise in the modal argument is the first one, that “God” names a possible reality. He said in his reply to Hick that all of his misgivings about believing in God rested on the suspicion, which is difficult to remove, that every form of theism masks an absurdity. At least in part, this explains Hartshorne’s efforts to defend metaphysics as both the search for necessary truths about existence and the development of a coherent dipolar theism. One can think of the global argument as the completion of this process. Setting aside the modal argument itself, each element of Hartshorne’s cumulative case is designed to buttress the claim that the existence of God is logically possible.

The various strands of the global argument highlight what Hartshorne considered to be the theistic implications of neoclassical metaphysics. Each argument is given a familiar name that suggests precursors in the history of philosophy, but none of them has an exact equivalent in the world’s philosophical literature. In addition to the ontological argument, Hartshorne develops his own versions of the cosmological, teleological, epistemic, moral, and aesthetic arguments. In keeping with Hartshorne’s use of position matrices, each argument is presented as a logically exhaustive set of options. We have already hinted at this style of reasoning in the modal argument, where the choice is whether God’s existence is a necessity (~M~p*), an impossibility (~Mp*), or a contingency (Mp* and M~p*). Other strands of the global argument are also presented in this way: to affirm one alternative is to deny all others, and, conversely, to deny one is to affirm that one of the others is true. In each case, Hartshorne employs what he calls “the principle of least paradox” to conclude that the rational cost of rejecting neoclassical theism is greater than the cost of accepting it. Time and again, Hartshorne acknowledged the difficulties of an unqualified verdict in favor of neoclassical theism, but he also believed that his view better answered the questions of metaphysics than his rivals. Hartshorne was epistemically cautious in recognizing that his method would not yield a decisive victory for his own views. As with the modal argument, Hartshorne believed that no degree of logical rigor can eliminate the need for an element of intuitive judgment. The “essential element in rational procedure in metaphysics” is to honestly face the logically possible alternatives and to weigh up the cost of accepting or rejecting them (Viney 1985, x).

Much of the global argument is anticipated in Hartshorne’s explanation and defense of neoclassical metaphysics (see “Charles Hartshorne: Neoclassical Metaphysics”). Consider an outline of Hartshorne’s cosmological argument. As noted above, Hartshorne argues that “Something exists” is necessarily true. The principle of contrast and Hartshorne’s defense of de re modality, if correct, imply that what exists is characterized by both contingency and necessity. The necessary, moreover, as the common element in all possibility, is abstract. If it is possible for this necessity to be divine—more precisely, the abstract pole of the divine—then it is possible for God to exist. This supports the weak premise of the modal argument that God’s existence is logically possible. To reject the conclusion, one must deny the necessity of existence, the principle of contrast, de re modalities, the character of necessity as abstract, or the possibility that the necessary aspect of things is divine. Hartshorne’s cosmological argument differs from traditional versions in not concluding to the existence of a prime mover, an uncaused cause, or a wholly necessary being. Of course, none of these descriptions fits the dipolar God, and Hartshorne had no interest in defending them.
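
Read as a premise list in the style of the earlier formalization (a reconstruction of the paragraph above, not Hartshorne’s own wording):

  1. Necessarily, something exists.
  2. What exists is characterized by both contingency and necessity (the principle of contrast and de re modality).
  3. The necessary, as the common element in all possibility, is abstract.
  4. It is possible that this necessary, abstract element is the abstract pole of deity.
  5. Therefore, it is possible that God exists (Mp*), the first premise of the modal argument.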

Causal principles enter Hartshorne’s cumulative case in his argument from cosmic order, which he calls “the design argument.” Hartshorne defends a metaphysic according to which the cosmos is a theater of interactions among dynamic singulars, all of which act and are acted upon. The existence of many real beings, thus defined, raises the problem of cosmic order. The question is not why there is order rather than mere chaos. For Hartshorne, chaos presupposes order as much as non-existence presupposes existence—indeed, mere chaos is indistinguishable from nonbeing. The question, rather, is how there can be order on a cosmic scale if there is only an uncoordinated set of centers of creative activity. Localized order, or order within the cosmos, can be explained by localized activity of entities within the cosmos. The order of the cosmos, however, cannot be the outcome of a coordinated effort by the many entities since their very existence, severally scattered throughout the cosmos, presupposes the cosmos as a field of activity. If there is a cosmic-ordering power that itself falls under the metaphysical principle of acting and being acted upon, then cosmic order can be explained. Moreover, as Hartshorne argues in A Natural Theology for Our Time, the explanation is not ad hoc since all real beings, localized ones and the cosmic-ordering power, fall under the same metaphysical principle. The cosmic-ordering power is not, in the words of Alfred North Whitehead, an exception to metaphysical principles, invoked to save their collapse, but is their chief exemplification.

Hartshorne allows that the expression “cosmic order” permits different values; the laws of nature must include constants as well as variables, and the values of the constants (for example, the speed of light) are not logical necessities. In this way, one may speak, with Whitehead, of different “cosmic epochs” in which the laws of nature beyond the singularities of our universe are not identical with our own. Hartshorne insists, however, that the problem of cosmic order remains. This is because our conceptions of the fundamental laws of nature are contingent and mathematically peculiar in character. For instance, an epoch such as our own with a law of gravitation specified by “mass times mass proportioned to the radius squared” is a particular nomological condition to be conceptually contrasted with, say, gravitation as “mass times mass proportioned to the radius cubed.” Basic laws of nature appear to have the logical earmarks of “contingent decrees,” and as such it is legitimate to ask for their causal explanations. Thought experiments which assert that such basic laws could be instituted by chance mechanisms beg the question of basic order. An example is Hume’s suggestion of an Epicurean universe of swerving atoms that happen to arrange themselves into the cosmic “regularities” we observe. As Hartshorne says in A Natural Theology for Our Time, talk of atoms with a definite character persisting through time is “already a tremendous order.” Recent thought experiments in cosmology such as “bubble inflation” models also seem to posit background assumptions of contingent cosmic conditions, including the operating laws of quantum mechanics, which necessarily involve specific quantitative values (for example, as in the use of Planck’s constant).
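
In familiar notation (the symbols G, m₁, m₂, and r are the standard gravitational constant, masses, and separation, not Hartshorne’s own), the gravitational contrast mentioned above is between:

  F = G·m₁·m₂ / r²   (the inverse-square form our epoch exhibits)
  F = G·m₁·m₂ / r³   (an equally conceivable inverse-cube alternative)

Nothing in logic alone selects one exponent over the other, which is the sense in which such laws bear the earmarks of “contingent decrees.”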

On Hartshorne’s neoclassical theistic alternative, one arguably need not settle for any metaphysically inexplicable contingent cosmic order or a freedom-suppressing “necessitarian” universe. It is also well to remember that Hartshorne vigorously defends “indeterminism.” If determinism is false, then neither the order within the cosmos nor the order of the cosmos is absolute. Multiple real beings with varying degrees of creative power are a recipe for conflict. To be sure, the existence of multiple real beings also opens the possibility for cooperative endeavors, whether it is cooperation among localized beings or between localized beings and the cosmic designer; but multiple creativity guarantees a mixture of disharmony and harmony. The cosmic-ordering power can guarantee a cosmic order, but because of the existence of a plurality of real beings that act, and are not simply acted upon, not everything that happens can be chosen by a single individual, even a divine one. This is relevant to the problem of theodicy, for it shows that, in neoclassical metaphysics, conflicts of decisions among the creatures and between the creatures and God are possible, opening the way to tragedies that not even God can avoid.

A skeptic may embrace any of the options that Hartshorne denies, but at a cost. Hartshorne argues that each of the non-theistic options has dubious metaphysical credentials and that his solution to the problem of cosmic order is the most parsimonious. If there is no cosmic order one must explain the apparent success of science in discovering that order. If there is no cosmic-ordering power then either localized beings are being used to explain an order that their activity presupposes or there is no explanation of the order. Another atheistic option is to accept that there is a cosmic-ordering power but deny that it is divine. Hartshorne considered “panentheism” to provide a superior analogy to anything atheism can propose for the cosmic designer. However, the remaining three strands of the global argument can also be used to support the idea of such an ordering power; it is not only an agent causally affecting the world but is also affected by the world and incorporates it into the divine life, as one that perfectly knows the world (epistemic argument), perfectly preserves its achievements (moral argument), and fully appreciates the world (aesthetic argument).

In the epistemic argument, Hartshorne raises the question of the relation between reality and knowledge. In one respect, knowledge depends upon the real, for one cannot know what is not real. On the other hand, it is difficult to give an account of the real apart from some form of knowledge. As Hartshorne (Creative Synthesis 288) notes, Immanuel Kant suggested that appearance differs from reality because “ … the content of our sensory intuition differs from the content of a non-sensory intuition” (see also Kant’s Critique of Pure Reason, A249, A252). The object of the non-sensory intuition is the “noumenon.” (Hartshorne parts company with Kant in conceiving God’s knowledge as partly passive rather than as wholly active.) Taking up Kant’s point, no merely partial or fallible knowing can circumscribe the real, for the extent of errors in knowing is measured by the real—if one is mistaken about x then something about x escapes one’s knowledge. In view of these conundrums, it is tempting to say that reality is the potential content of infallible knowledge—what an epistemically unsurpassable being would know if it existed. The problem with this solution, as far as atheism is concerned, is that an infallible knower, by definition, could not possibly be mistaken. However, such a knower would know its own existence, and if reality is what it would know, then one is led to posit not simply the possible existence of an infallible knower, but also its actual existence. Hartshorne drew precisely this conclusion, that reality is the actual content of infallible knowledge. He argued further, following Josiah Royce, that defects in cognitive experience are internal to experience. Hartshorne mentions confusion, inconsistency, doubt, inconstancy of beliefs, and “above all, a lack of concepts adequate to interpret our percepts and of percepts adequate to distinguish between false and true concepts” (Creative Synthesis 288).

A distinctive feature of Hartshorne’s account of perfect knowledge is that it requires both cognitive and affective components (see “Charles Hartshorne: Dipolar Theism,” part 5). God must be conceived not only as knowing all true propositions but also as knowing the creatures themselves; that is, feeling what they feel. Whatever one has been and however one has felt become transformed thereafter as an everlasting memory in God’s consciousness. This applies also to the collective life of the creatures. There is no mere numerical sum of value in God—as if value were simply a question of set membership—for the experiences of creatures become woven into the fabric of God’s undying experience. This is what Hartshorne means by “contributionism,” that the creatures enrich the divine life in a way that would not have been possible apart from their activity. In comments he made on a debate about the resurrection of Jesus, Hartshorne (Did Jesus Rise From the Dead? 140) asked, “If people can live or die for country, or other human groups, why can they not live and die for that which embraces all groups and their intrinsic values—the divine life?” Hartshorne was fond of quoting the Jewish prayer, “Help us to become co-workers with You, and endow our fleeting days with abiding worth.” The moral argument brings out the attractiveness of this ideal as the supreme aim of creaturely existence.

There are a number of ways to reject contributionism. One may deny that there is any supreme aim, theistic or nontheistic. Hartshorne argues that this robs comparative value judgments of a standard of comparison; if, as most reflective people would accept, it is possible to squander one’s life on trivial, unimportant, or immoral pursuits, then there must be a measure of the good life that is being used as a comparison. Another option is that self-interest is the supreme aim. Hartshorne follows the Buddhists in rejecting this (see “Charles Hartshorne: Neoclassical Metaphysics”). More plausible is the idea that the aim of life is to live for self and for others either during this life or in an afterlife. Hartshorne considered this laudable, but finally unsatisfactory as the supreme aim of life. First, he argued that there is at best a numerical meaning of “general welfare,” whereas neoclassical theism provides an experiential meaning in God’s experience. Second, there is the problem of mortality. In “A Free Man’s Worship,” Bertrand Russell stated the problem clearly when he proposed to build a philosophy of life upon a foundation of “unyielding despair.” The despair stems from the recognition that “the noonday brightness of human genius” and “the whole temple of man’s achievement” are destined to perish. There is, to be sure, apparent nobility in such Sisyphean labor, except that “nobility” and “tragedy” become, on this account, as if they had never been. Dipolar theism, on the other hand, accounts for the value of past achievement as an enduring aspect of the unending process of God’s life and memory. Moreover, the value of living for self and others is included in Hartshorne’s account, for the supreme “other” is God. The extent and nature of value that one contributes to God are precisely the extent and quality of value that one has contributed to others. Hartshorne argued that contributionism captures the inclusive nature of love that one finds expressed in biblical ethics: one cannot love God if one does not love others, and one is to love God with everything one is and to love one’s neighbor as oneself.

An argument from the beauty of the world as a de facto whole rounds out Hartshorne’s cumulative case and ties it to the aesthetic motif of his philosophy. It is quite natural, and prima facie rational, to speak of enjoying the beauty of the cosmos. Most people consider it appropriate to include aesthetic predicates in descriptions of the universe, for it is endlessly interesting, mysterious, and awe inspiring. Hartshorne described science as the search for the hidden beauty of the world, and many great scientists would agree; even those who have little or no use for philosophy or religion, like Steven Weinberg who states that the universe is beautiful beyond what seems necessary. An aesthetically displeasing universe, says Hartshorne, would be either chaotic or monotonous. What we find, on the contrary, is order in the laws of nature and variety in the evolution of new arrangements of matter and levels of mind. Hartshorne speaks of the world as a de facto whole, for he means to stress its open-ended and dynamic character. If atheism is true, then it is non-divine individuals alone that enjoy the beauty of the universe as a whole, catching a glimpse of it in the slice of time that is available to them and to the species. The peek that we have of the beauty of the cosmos, moreover, reveals horizons suggestive of aesthetic riches forever beyond our grasp. Hartshorne argues that this would represent an irremediable aesthetic defect in the universe, for beauty should be enjoyed and only God could adequately enjoy the beauty of the world as a whole. Of course, what should be is not necessarily what is. Hartshorne insists, however, that unlike merely contingent defects, the lack of a divine spectator would be a necessary defect, “an eternally necessary yet ugly aspect of things” (Creative Synthesis 290). It is a thought without intrinsic reward or pragmatic value, best conceived as a thought experiment whose purpose is to make us realize a divine mind that can appreciate the beauty that escapes us.

The conclusions of the design and epistemic arguments, together with Hartshorne’s “psychicalism,” lend support to his aesthetic argument. As the supreme cosmic-ordering power, whose knowledge is the ultimate measure of reality, the divine, in any particular state of its life, must find within itself the entire wealth of all creative experiencing that has ever existed. This experience of a universe in process is, as Whitehead says, “beyond our imagination to conceive”; it includes (to us) the imperceptible abyss of the past as well as the infinite possibilities of the future. It is here that these lines of inference dovetail with the moral argument. God must be conceived not only as the supreme spectator appreciating the beauty of the world as a de facto whole, but also as the supremely beautiful (or sublime) object of contemplation, adoration, and worship—an endlessly unfolding cosmic experience to which we contribute. Also implicit in Hartshorne’s theology is that God is, as it were, the supreme actor in the play of existence. The various roles of the deity, as Hartshorne conceives it, are neatly summarized in the title of one of his articles: “God as Composer-Director, Enjoyer, and, in a Sense, Player of the Cosmic Drama.”

3. The Problem of Evil and Theodicy

As long as there have been theists there has been a problem of evil, whether as a believer’s lament (as in Job), as a theologian’s conundrum (as in Augustine), or as a skeptic’s argument (as in Hume). Contemporary philosophers of religion speak of two forms of the problem of evil: the logical and the evidential. The logical problem of evil raises the question whether the existence of evil, conceived as gratuitous suffering, is logically consistent with the existence of a God that is perfect in power, knowledge, and goodness. The evidential problem of evil raises the question whether its existence renders improbable that of a perfect God. Hartshorne found neither version of the problem especially troublesome for his form of theism. He held that the problem with both versions of the problem of evil, as they are usually stated, is that they pose a loaded question, presupposing a concept of divine power that, in Hartshorne’s (Philosophical Aspects of Thanatology 86) words, “is not even coherent enough to be false.” Hartshorne developed and defended a metaphysic of shared creativity in which no individual, not even a divine one, can have a monopoly of power (see “Charles Hartshorne: Neoclassical Metaphysics” and “Dipolar Theism”). He was fond of disagreeing with Einstein who said that God does not play dice. On the contrary, chance and multiple freedom are inseparable; it is no accident, said Hartshorne (Studies in the Philosophy of J. N. Findlay 230), that there are accidents. Although God has the eminent form of creative power, it is not enough to guarantee a world without accidents, wrongdoing, and tragedy. Hartshorne would say that the evidential problem of evil suffers from the additional defect of assuming that God’s existence is an empirical question. We have seen that, according to Hartshorne, this represents a failure to appreciate the logical consequences of “Anselm’s discovery.”

Much of the appeal of traditional religion is that it offers the hope that the gulf between what is and what ought to be can be bridged in a future existence. It promises that the cosmic scales of justice are finally balanced either through the mysterious operations of karma in the process of reincarnation or through the omnipotence of God in a heavenly or hellish afterlife. Hartshorne considered these to be false hopes. While he did not definitively reject the possibility of an afterlife, he showed no interest in speculating about it or defending the idea. He argued that it is the divine prerogative alone to persist through infinite variations; the self-identity (that is, the genetic identity) of a non-divine individual cannot sustain itself indefinitely. Even if there were an afterlife, there could be no guarantee that the individual would survive long enough for every injustice, or even the greatest of injustices, in that person’s life to be rectified. Moreover, an afterlife could not eliminate the risk inherent in multiple or shared creativity. Traditional accounts of the afterlife are plausible only to the extent that creaturely freedom bends to a higher moral law (karma) or will (God’s) imposed on it. The heavens, hells, and purgatories of religion are elaborately orchestrated so as to place all lesser freedoms in perfect harmony with justice. In Hartshorne’s neoclassical metaphysics—especially evident in his design argument—God has the power to insure order on a cosmic scale, a power that is tantamount to insuring a field of activity for localized individuals. Divine power does not, however, extend to insuring what decisions the creatures will make. No particular outcome can be guaranteed.

To grant that the two versions of the problem of evil do not undermine neoclassical metaphysics still leaves the question of God’s role regarding suffering and injustice. The facts that generate the problem of evil do not go away because one successfully rebuts a philosophical argument. Hartshorne claims that his theology makes better sense of “God is love” than its competitors, yet there is a great deal of suffering that is undeserved, pointless, and widespread. Evolutionary theory adds another dimension. Entire ecosystems and countless species have come and gone in the course of geologic time. Throughout this history, creatures compete for the goods that will insure their survival and very often live at each other’s expense. Nature seems entirely indifferent to comparative values; as John B. Cobb Jr. noted, “lower” species thrive at the expense of “higher” species, as when malarial mosquitoes feed on human beings. Finally, there are what Marilyn Adams calls “horrendous evils,” evils that are so pernicious that they give reason to doubt that the person’s life could be a great good to him or her on the whole. Hartshorne claims that a loving God is a necessary and indispensable character in this drama. One may ask whether this is plausible, but one must also take care not to permit the presuppositions of classical theism to color one’s judgment. Hartshorne counsels us to be suspicious of the question whether our world is the sort that one would expect from an almighty and all-loving creator. In the context of dipolar theism the question must be rephrased: Is this the sort of world that one would expect of a deity, perfect in power and love, that presides over a world composed of beings, each of which exercises some degree of creativity?

If Hartshorne is correct, God accounts for order on a cosmic scale. There must be, however, two aspects to this activity that are distinguishable but not separable. On the one hand, there is the ordering activity that establishes the cosmic order per se, making possible all non-divine forms of freedom. On the other hand, there is the ordering activity that lures each localized being towards greater intensity of experience. Hartshorne holds that both aspects of God’s creative ordering of the world follow aesthetic principles (see “Charles Hartshorne: Neoclassical Metaphysics”). According to these principles, the double extremes between which the divine ordering power operates are (1) unqualified unity and unqualified diversity (or chaos) and (2) ultra-complexity and ultra-simplicity (or triviality). The mere fact of an ordered cosmos does not automatically avoid the aesthetic defects of being overly chaotic or trivial. Avoidance of these extremes requires a cumulative developmental process, which is implicit in Hartshorne’s cumulative view of process. In neoclassical metaphysics, “the explanation for the contingent must be a genetic one,” as Hartshorne (82) says in Insights and Oversights of Great Thinkers. It could not be everlastingly true that there have been elephants or seahorses. Because the process is cumulative, it must also be developmental. For example, an elephant is not created de novo from a mixture of atoms and molecules; it requires a lengthy process of species development. This is why Hartshorne claimed in Omnipotence and Other Theological Mistakes that the general idea of evolution is derivable from his metaphysical principles.

God’s role in the economy of nature is not simply maintaining cosmic order, but also eliciting higher forms of order, making possible forms of experience with greater levels of unity in diversity. A law of axiology as firm as any law of nature is that varying levels of creative experience are necessarily correlated with varying levels of what can be achieved in the way of value. For example, as complex and emotionally rich as a dog’s interior life may be, it is not sufficient to produce scientific theorizing or high artistic accomplishment. What follows is that varying levels of creativity exhibit varying levels of opportunity and risk. For instance, one cannot be ironic with a dog. Irony may amuse or offend only if one’s audience can understand it. As goes creative experiencing, so goes freedom. The cost of actual or possible achievement is the risk of failure. This analysis is evident in the few comments that Hartshorne made about sin. In a 1944 symposium on world peace, Hartshorne said that much could be learned from Reinhold Niebuhr’s view that sin is not a struggle between “lower” (bodily) and “higher” (spiritual) aspects of personality. Rather, sin is a perversion of what is highest in a person, one’s sense of the divine; it is the claim to be divine, “a rebellion against our humble station in the universe” (Finkelstein and Maciever 597). This idolatry comes in many forms, religious and nonreligious, whether in pernicious claims to infallibility or in any attempt to place ultimate worth in something less than deity. As far as our experience goes, these are the highest and most tragic manifestations of the general principle that greater degrees of freedom necessarily bring with them greater possibilities of their abuse.

Hartshorne agrees that the world is better to the extent that sin, and the suffering it brings in its wake, is not part of it. It does not follow, however, that the world is better to the extent that the possibility of sin is excluded from it. The conditions for the possibility of good or evil are the same: freedom. Indeed, Hartshorne maintained that some degree of evil is inevitable if good is to be possible. It is true that the particular evils that occur are not inevitable. Knowing this, we imagine that the cosmos could be altogether free of the blemish of evil, but this is to imagine an ideal that no single individual could bring about. One might agree with this but ask, with Hartshorne, whether there is a greater possibility of evil than might be expected from an all-loving cosmic designer. In The Zero Fallacy, Hartshorne spoke of human beings as the “bullies of the planet,” heedless of the welfare of other creatures, cruel to our own kind, and too often lacking the will to prevent such cruelty. He asked whether the seemingly unbridgeable distances between the earth and other solar systems might be a providential arrangement. In Omnipotence and Other Theological Mistakes, he allows himself an expression of doubt as to whether the “perilous experiment” of creatures free of instinctive guidance was too dangerous. He says that if he played at criticizing God, it would be at this point. Yet, Hartshorne also accepted on faith the infallible wisdom and ideal power of God. In Wisdom as Moderation, Hartshorne denies that limited intellects are in a position to know whether there is too much risk of evil in the world, for such a judgment must include a potentially infinite future. He also stressed that the justification of the world is in the world; that is, in the open-ended adventure of life itself that God’s creativity insures.

One of Hartshorne’s definitions of religion is the acceptance of our fragmentariness. We are fragmentary both in the sense that we are limited in space and time (that is, we are localized) and in the sense that our capacities for knowledge and goodness are limited (that is, we are imperfect). If something like Hartshorne’s panentheism is correct, we are also fragmentary in the sense that we are part of the divine being-in-becoming (see “Charles Hartshorne: Dipolar Theism”). For Hartshorne, God includes all but does not determine all, much like a person includes the cells of his or her body without being able to decide the details of their activity. Thus, what we do makes a difference in and to God in the sense that we can enhance or diminish in admittedly limited ways the divine enjoyment of the world—hence, the concept of tragedy in God mentioned previously. We have also seen, in the moral argument, that Hartshorne regarded the aim of consciously contributing to the divine life as the highest purpose to which we can aspire. In Wisdom as Moderation, he says, “God’s possession of us is our final achievement, not our possession of God” (90). Every creature that has ever existed or will ever exist becomes part of the inexhaustible memory of God. In Plato’s Symposium, Socrates, reporting the views of Diotima, speaks of immortality as the achievement of doing acts worthy of future generations’ remembrance. Hartshorne offers a similar kind of immortality except that the fallible and mortal memory of future generations is replaced by the infallible and unending memory of God.

A Hartshornean theodicy does not allow one to say that everything, or every evil, happens for a reason. There is no cure for the fact that the “lower” sometimes lives at the expense of the “higher” and that horrendous evils are part of this universe. On the other hand, a Hartshornean theodicy allows one to say that anything that happens, or any evil that occurs, can become part of a reason for striving to overcome evil with good, thereby depriving evil of its capacity to dishearten us. The true depth of divine power, on Hartshorne’s view, is not God’s ability to manipulate events to the best possible outcome, but God’s ability to bear the suffering of the creatures without being overcome by it. God, on this view, is forever seeking ways to bring good from the world no matter how bad things may get. The world-weariness that sometimes overcomes the creatures never overcomes deity. In the language of William James, Hartshorne’s God is neither a pessimist (thinking that things cannot get better) nor an optimist (thinking that things are for the best), but a kind of cosmic meliorist (thinking that things can get better). This theology may console in at least two ways. To those who are helpless and who suffer, Hartshorne claims that there is a divine co-sufferer. To those who are not helpless and who work for the welfare of others, Hartshorne maintains that they are indeed working on the side of the cosmos itself, as co-workers with God. This is what Pierre Teilhard de Chardin called “building the earth.” In this way, Hartshorne’s theism may promote a resilient spirit in the face of defeat, hope that may conquer despair, and love that holds the promise of harnessing evil.

4. Conclusion

Hartshorne’s extensive writings on the ontological argument were instrumental in generating new interest in Anselm’s reasoning and in redoubling the efforts of philosophers in exploring and evaluating the variations that it can take. By highlighting a second form of ontological argument—a modal version—that the vast majority of philosophers had ignored, Hartshorne demonstrated that it was no longer sufficient to rely on Gaunilo or Kant for a refutation of Anselm. Hartshorne benefited from the formalizations of modal systems made popular by his teacher C. I. Lewis, and was the first to publish a formalized version of the modal argument. This unprecedented accomplishment clarified the argument and helped turn attention to its modal structure.

One could argue that Hartshorne was a victim of his own success. Just as many philosophers had failed to read Anselm closely enough to discern a second argument in his Proslogion, so philosophers tended not to read Hartshorne closely enough to understand that he never used the modal argument as a singular proof of theism. Hartshorne used the argument as a single strand in a cumulative or global argument for neoclassical theism. His way of presenting the elements of the global argument emphasized the rational cost of rejecting the premises, a cost that, in each case, Hartshorne argued, was greater than the cost of accepting the conclusion. To be sure, Hartshorne considered the modal argument an essential strand in the case for theism since it reveals, he believed, the logic of theism. If Hartshorne is correct, empirical arguments for or against the existence of God are unavailing because they misconstrue the nature of the theistic question. This idea also extends to skeptical arguments from evil that conclude to either the non-existence or probable non-existence of God. The problems of theodicy, for Hartshorne, concern the presence of evil in a universe in which every concrete particular has some degree of creativity, and not, as in traditional theology, the presence of evil in a universe where creativity is the unique privilege of God.

5. References and Further Reading

a. Primary Sources

i. Books in Order of Publication Date

  • Hartshorne, Charles. 1941. Man’s Vision of God and the Logic of Theism. Chicago: Willett, Clark and Company.
  • Hartshorne, Charles. 1948. The Divine Relativity: A Social Conception of God. New Haven, Connecticut: Yale University Press.
  • Hartshorne, Charles and William L. Reese, eds. 1953. Philosophers Speak of God. Chicago: University of Chicago Press. Republished in 2000 by Humanity Books.
  • Hartshorne, Charles. 1962. The Logic of Perfection and Other Essays in Neoclassical Metaphysics. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1965. Anselm’s Discovery: A Re-examination of the Ontological Proof for God’s Existence. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1967. A Natural Theology for Our Time. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1970. Creative Synthesis and Philosophic Method. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1983. Insights and Oversights of Great Thinkers: An Evaluation of Western Philosophy. Albany: State University of New York Press.
  • Hartshorne, Charles. 1997. The Zero Fallacy and Other Essays in Neoclassical Philosophy. Ed. Mohammad Valady. Peru, Illinois: Open Court Publishing Company.
  • Hartshorne, Charles. 2011. Creative Experiencing: A Philosophy of Freedom. Eds. Donald W. Viney and Jincheol O. Albany: State University of New York Press.

ii. Hartshorne’s Response to his Critics

  • Alston, William. 1964. “Interrogations of Charles Hartshorne.” Philosophical Interrogations. Eds. Sydney Rome and Beatrice Rome. New York: Holt, Rinehart and Winston: 319-54.
  • Cobb, John B. Jr. and Franklin I. Gamwell, eds. 1984. Existence and Actuality: Conversations with Charles Hartshorne. Chicago: University of Chicago Press.
  • Hahn, Lewis Edwin, ed. 1991. The Philosophy of Charles Hartshorne, The Library of Living Philosophers, Volume XX. La Salle, Illinois: Open Court.
  • Kane, Robert and Stephen H. Phillips, eds. 1989. Hartshorne, Process Philosophy and Theology. Albany: State University of New York Press.
  • Sia, Santiago, ed. 1990. Charles Hartshorne’s Concept of God: Philosophical and Theological Responses. Dordrecht, the Netherlands: Kluwer Academic Publishers.

iii. Selected Articles

  • Hartshorne, Charles. 1944. “The Formal Validity and the Real Significance of the Ontological Argument.” The Philosophical Review 53.3: 225-45.
  • Hartshorne, Charles. 1945. “On Hartshorne’s Formulation of the Ontological Argument: A Rejoinder [to Elton].” Philosophical Review 54.1: 63-5.
  • Hartshorne, Charles. 1961. “The Logic of the Ontological Argument.” Journal of Philosophy 58.17: 471-73.
  • Hartshorne, Charles. 1962. Introduction. Saint Anselm: Basic Writings. 2nd ed. Trans. S. N. Deane. La Salle, Illinois: Open Court Publishing Company: 1-19.
  • Hartshorne, Charles. 1963. “Rationale of the Ontological Proof.” Theology Today 20.2: 278-83.
  • Hartshorne, Charles. 1966. “Is the Denial of Existence Ever Contradictory?” Journal of Philosophy 63.4: 85-93.
  • Hartshorne, Charles. 1967. “Necessity.” Review of Metaphysics 21.2: 290-96.
  • Hartshorne, Charles. 1967. “Rejoinder to Purtill.” Review of Metaphysics 21.2: 308-09.
  • Hartshorne, Charles. 1972. “Can There Be Proofs for the Existence of God?” Religious Language and Knowledge. Eds. Robert H. Ayers and William T. Blackstone. Athens: University of Georgia Press: 62-75.
  • Hartshorne, Charles. 1977. “John Hick on Logical and Ontological Necessity.” Religious Studies 13.2: 155-65.
  • Hartshorne, Charles. 1978. “A Philosophy of Death.” Philosophical Aspects of Thanatology. Vol. 2. Eds. Florence M. Hetzler and A. H. Kutscher. New York: MSS Information Corp.: 81-89.
  • Hartshorne, Charles. 1982. “Grounds for Believing in God’s Existence.” Meaning, Truth, and God. Ed. Leroy S. Rouner. London: University of Notre Dame Press: 17-33.
  • Hartshorne, Charles. 1984. “God and the Meaning of Life.” On Nature. Vol. 6. Ed. Leroy S. Rouner. Notre Dame, Indiana: University of Notre Dame Press: 154-68.
  • Hartshorne, Charles. 1985a. “Theistic Proofs and Disproofs: The Findlay Paradox.” Studies in the Philosophy of J. N. Findlay. Eds. Robert S. Cohen, Richard M. Martin, and Merold Westphal. Albany: State University of New York Press: 224-34.
  • Hartshorne, Charles. 1985b. “Our Knowledge of God.” Knowing Religiously. Vol. 7. Ed. Leroy S. Rouner. Notre Dame, Indiana: University of Notre Dame Press: 52-63.
  • Hartshorne, Charles. 1987. “Response to resurrection debate.” Did Jesus Rise From the Dead? The Resurrection Debate. Ed. Terry L. Miethe. San Francisco: Harper & Row: 137-42.
  • Hartshorne, Charles. 1989. “Metaphysical and Empirical Aspects of the Idea of God.” Witness and Existence: Essays in Honor of Schubert M. Ogden. Eds. Philip E. Devenish and George L. Goodwin. Chicago: University of Chicago Press: 177-189.
  • Hartshorne, Charles. 1999. “Can We Understand God?” Framing a Vision of the World: Essays in Philosophy, Science and Religion. Eds. André Cloots and Santiago Sia. Belgium: Leuven University Press: 87-97.

b. Secondary Sources

  • Boyd, Gregory A. 1992. Trinity and Process: A Critical Evaluation and Reconstruction of Hartshorne’s Di-Polar Theism Towards a Trinitarian Metaphysics. New York: Peter Lang.
  • Burrell, David B. 1982. “Does Process Theology Rest on a Mistake?” Theological Studies 43.1: 125-35.
  • Clarke, Bowman. 1971. “Modal Disproofs and Proofs for God.” Southern Journal of Philosophy 9.3: 247-58.
  • Dombrowski, Daniel A. 1996. Analytic Theism, Hartshorne, and the Concept of God. Albany: State University of New York Press.
  • Dombrowski, Daniel A. 2006. Rethinking the Ontological Argument: A Neoclassical Theistic Response. New York: Cambridge University Press.
  • Finkelstein, Louis and Robert M. MacIver, eds. 1944. Approaches to World Peace: A Symposium. New York: Conference on Science, Philosophy, and Religion in their Relation to the Democratic Way of Life.
  • Goodwin, George L. 1978. The Ontological Argument of Charles Hartshorne. Missoula, Montana: Scholars Press.
  • Goodwin, George L. 1983. “The Ontological Argument in Neoclassical Context: Reply to Friedman.” Erkenntnis 20: 219-32.
  • Goodwin, George L. 2003. “De Re Modality and the Ontological Argument.” Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Ed. George W. Shields. Albany: State University of New York Press: 175-97.
  • Kane, Robert. 1984. “The Modal Ontological Argument.” Mind 93: 336-50.
  • Lucas, Billy Joe. 2003. “The Second Epistemic Way.” Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Ed. George W. Shields. Albany: State University of New York Press: 199-207.
  • Neville, Robert C. 1980. Creativity and God: A Challenge to Process Theology. New York: The Seabury Press.
  • Neville, Robert C. 2009. Realism in Religion: A Pragmatist’s Perspective. Albany: State University of New York Press.
  • Oppy, Graham. 1995. Ontological Arguments and Belief in God. New York: Cambridge University Press.
  • Peirce, C. S. 1934. The Collected Papers of Charles Sanders Peirce. Vol. 5, Eds. Charles Hartshorne and Paul Weiss. Cambridge: Harvard University Press.
  • Peters, Eugene H. 1970. Hartshorne and Neoclassical Metaphysics. Lincoln: University of Nebraska Press.
  • Peters, Eugene H. 1984. “Charles Hartshorne and the Ontological Argument.” Process Studies 14.1: 11-20.
  • Shields, George W. 1980. “Review of The Ontological Argument of Charles Hartshorne by George L. Goodwin.” The Journal of Religion 60.3: 357-59.
  • Shields, George W. 1980. “Hartshorne’s Modal Ontological Argument.” Dialogue 22.1-2: 45-56.
  • Shields, George W. 1983. “God, Modality and Incoherence.” Encounter 44.1: 27-39.
  • Shields, George W. 1992. “Hartshorne and Creel on Impassibility.” Process Studies 21.1: 44-59.
  • Shields, George W. 1992. “Infinitesimals and Hartshorne’s Set-Theoretic Platonism.” The Modern Schoolman 49.2: 123-134.
  • Shields, George W., ed. 2003. Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Albany: State University of New York Press.
  • Sia, Santiago. 1985. God in Process Thought: A Study in Charles Hartshorne’s Concept of God. Dordrecht, the Netherlands: Martinus Nijhoff.
  • Sia, Santiago. 2004. Religion, Reason and God: Essays in the Philosophy of Charles Hartshorne and A. N. Whitehead. Frankfurt am Main: Peter Lang.
  • Sia, Santiago, ed. 1986. Word and Spirit, a Monastic Review, 8: Process Theology and the Christian Doctrine of God. Petersham, Massachusetts: St. Bede’s Publications.
  • Viney, Donald Wayne. 1985. Charles Hartshorne and the Existence of God. Albany: State University of New York Press.
  • Viney, Donald Wayne. 1986. “How to Argue for God’s Existence: Reflections on Hartshorne’s Global Argument.” The Midwest Quarterly 28.1: 36-49.
  • Viney, Donald Wayne. 1987. “In Defense of the Global Argument: A Reply to Professor Luft.” Process Studies 16.4: 309-311.
  • Viney, Donald Wayne. 2005. “A Lamp to Our Doubts: Ferré, Hartshorne, and Theistic Arguments.” Nature, Truth, and Value: Exploring the Thinking of Frederick Ferré. Eds. George Allan and Merle F. Allshouse. Lanham, Maryland: Lexington Books: 255-69.
  • Whitney, Barry L. 1985. Evil and the Process God. Toronto: Edwin Mellen Press.
  • Wilcox, John T. 1961. “A Question from Physics for Certain Theists.” Journal of Religion 40.4: 293-300.
  • Wood, Forest Jr. and Michael DeArmey, eds. 1986. Hartshorne’s Neo-Classical Theology. New Orleans: Tulane University Press.

c. Bibliography

  • Viney, Donald Wayne and Randy Ramal. 2007. “Primary Bibliography of Philosophical Works of Charles Hartshorne.” Hartshorne: A New World View: Essays by Charles Hartshorne. Ed. Herbert F. Vetter. Cambridge, Massachusetts: Harvard Square Library: 129-160. Also published in Sia, Santiago. 2004. Religion, Reason and God. Frankfurt am Main: Peter Lang: 195-223.

Author Information

Donald Wayne Viney
Email: don_viney@yahoo.com
Pittsburg State University
U. S. A.

and

George W. Shields
Email: George.shields@kysu.edu
Kentucky State University
U. S. A.

Reformed Epistemology

Reformed epistemology is a thesis about the rationality of religious belief. A central claim made by the reformed epistemologist is that religious belief can be rational without any appeal to evidence or argument. There are, broadly speaking, two ways that reformed epistemologists support this claim. The first is to argue that there is no way to successfully formulate the charge that religious belief is in some way epistemically defective if it is lacking support by evidence or argument. The second way is to offer a description of what it means for a belief to be rational, and to suggest ways that religious beliefs might in fact be meeting these requirements. This has led reformed epistemologists to explore topics such as when a belief-forming mechanism confers warrant, the rationality of engaging in belief forming practices, and when we have an epistemic duty to revise our beliefs. As such, reformed epistemology offers an alternative to evidentialism (the view that religious belief must be supported by evidence in order to be rational) and fideism (the view that religious belief is not rational, but that we have non-epistemic reasons for believing).

Reformed epistemology was first clearly articulated in a collection of papers called Faith and Rationality edited by Alvin Plantinga and Nicholas Wolterstorff in 1983. However, the view owes a debt to many other thinkers.

Table of Contents

  1. Introduction
  2. The Origins of Reformed Epistemology
    1. Reformed
    2. Epistemology
  3. Key Figures in Reformed Epistemology
    1. William Alston
    2. Alvin Plantinga
    3. Nicholas Wolterstorff
  4. Evidence and Rational Belief in God
  5. Classical Foundationalism
    1. Rejecting Classical Foundationalism
  6. The Positive Case in Reformed Epistemology
    1. The Christian Mystical Practice
    2. The Parity Argument
    3. Warranted Christian Belief
  7. Objections to Reformed Epistemology
    1. Great Pumpkin Objection
    2. Disanalogies
    3. Religious Diversity
      1. Religious Belief is Epistemically Arbitrary
      2. Competing Belief Forming Practices
    4. Sensible Evidentialism
  8. References and Further Reading

1. Introduction

Here is an argument against the rationality of belief in God:

(1) Belief in God requires the right kind of evidence in order to be rational.

(2) No such evidence exists for belief in God.

(3) Therefore, belief in God is not rational.

The idea here is that in order for belief in God to be rational, there needs to be an appropriate relationship between belief and evidence. What is appropriate, according to those who endorse the above argument, is that the belief in question be based on good evidence. This argument is sometimes referred to as the evidentialist objection to belief in God. According to the reformed epistemologist, philosophers have historically taken premise 1 to be rather intuitive. As a result, discussion of the rationality of belief in God has focused almost entirely on premise 2. Thus, philosophers who defended the rationality and justification of belief in God typically did so by responding to premise 2 and providing evidence for God’s existence. The evidentialist objection fails, they claim, because sufficient evidence does exist for rational belief in God. According to the reformed epistemologist, then, theists who reject premise 2 (historically, at least) simply endorse the following argument:

(1) Belief in God requires the right kind of evidence in order to be rational.

(2*) Such evidence does exist for rational belief in God.

(3*) Therefore, belief in God is rational.

For the theist who defends this argument, finding the right kind of evidence that is sufficient for rational belief in God becomes their chief aim. The problem, according to the reformed epistemologist, is that such a move is unnecessary. There is, in other words, a much easier way around the evidentialist objection—the rejection of premise 1. Thus, for the reformed epistemologist the problem with the evidentialist objection lies not with 2, but with 1. Why assume that belief in God is in any way subject to the demands of 1? Belief in God, argues the reformed epistemologist, can be rational without inference from evidence or argument. If this central claim is true, 1 is undermined and the evidentialist objection (as it stands) fails.
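
To make the dialectic vivid, the logical shape of the exchange can be put in bare propositional terms. The following is only a minimal sketch in Lean notation, not something drawn from the reformed epistemology literature; the names Rational and Evidence are hypothetical placeholders for “belief in God is rational” and “there is adequate evidence for belief in God.”

    -- Illustrative sketch only; placeholder propositions, not a formalization from the source.
    -- p1 is premise (1): rationality requires evidence.
    -- p2 is premise (2): no such evidence exists.
    -- The conclusion is (3): belief in God is not rational.
    example (Rational Evidence : Prop) (p1 : Rational → Evidence) (p2 : ¬Evidence) : ¬Rational :=
      fun r => p2 (p1 r)

Once premise (1) is rejected, as the reformed epistemologist proposes, premise (2) by itself entails nothing about Rational, which is why rejecting (1) is presented as the easier way around the objection.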

2. The Origins of Reformed Epistemology

Reformed epistemology first appeared in the early 1980s but the view owes a debt to many other thinkers. The influences on reformed epistemology can be divided into two groups: reformed influences and influences from within epistemology.

a. Reformed

Reformed epistemology was first clearly articulated in a collection of papers called Faith and Rationality edited by Alvin Plantinga and Nicholas Wolterstorff in 1983. The word “reformed” in reformed epistemology reflects the clear influence of the reformed theological tradition on the view. Two of the leading proponents—Plantinga and Wolterstorff—taught at Calvin College, and both take inspiration from important reformed thinkers such as John Calvin and Abraham Kuyper.

The most explicit appeal to the reformed tradition is found in Alvin Plantinga’s work. When considering how theistic belief might be grounded, Plantinga suggests that Calvin may have been right when he said that God has created humans with an inner awareness of himself, and that it is this sensus divinitatis that is responsible for theistic belief. Plantinga also engages with and criticizes reformed thinkers, such as Karl Barth, who reject natural theology (see Plantinga 1983).

Despite the important role that reformed thought has played in the early days of reformed epistemology, and, in particular, in the thinking of some of its key proponents, the central tenets of reformed epistemology do not depend on this tradition. Plantinga has tried to make this more explicit. In Warranted Christian Belief he argues that the ideas he finds in Calvin are also found in Thomas Aquinas. In fact, there is no reason to think that numerous other traditions within Christian thought could not adopt something like the view defended by reformed epistemologists. Furthermore, the view could easily be adapted by other religions—particularly monotheistic religions.

In light of this, the word “reformed” in reformed epistemology is best thought of as describing the inspiration behind the position rather than its core claims. Objections to reformed thought, or to Christianity more generally, may leave reformed epistemology unscathed.

b. Epistemology

As well as being influenced by the reformed tradition, reformed epistemology draws on work in epistemology. The philosopher who has most clearly been influential to reformed epistemologists is Thomas Reid, a Scottish Presbyterian minister. Reid’s epistemology is distinctive because of the importance he places on describing the belief forming faculties that give rise to our beliefs. These faculties are dispositions to form certain beliefs in response to being triggered in certain ways. These dispositions can vary over time and we can gain some and lose others through training or habit. But some of our belief dispositions are innate—we are simply born with them. According to Reid these innate dispositions cannot ultimately be rationally grounded by us, but we must rely on them nonetheless.

This Reidian picture of epistemology has had a significant influence on reformed epistemology. Accordingly, reformed epistemologists argue that in order to understand whether or not our religious beliefs are rational we must consider what sorts of being we are and the innate belief dispositions that we have.

3. Key Figures in Reformed Epistemology

Though perhaps not a sufficient condition, the rejection of premise 1 above is at least a necessary condition when it comes to identifying key figures within reformed epistemology. Below, then, we discuss three philosophers who reject the idea that belief in God is rational only when inferred from good evidence. These philosophers—William Alston, Alvin Plantinga, and Nicholas Wolterstorff—are key figures within religious epistemology and were central in the development of reformed epistemology.

a. William Alston

William Alston’s first major contribution to reformed epistemology comes in a pair of essays “Religious Experience and Religious Belief” and “Christian Experience and Christian Belief” (the latter of these appears in Faith and Rationality, which is edited by Alvin Plantinga and Nicholas Wolterstorff). His aim is to argue that Christian Practice (CP) is justified. CP is the practice of forming certain kinds of beliefs in response to certain experiences. The sorts of beliefs in question are those such as “God will provide for his people” or “God will forgive the sins of the truly repentant.” They are beliefs about God and his activities and Alston calls these beliefs “M-beliefs” where M stands for manifestation (Alston 1983: 104-105).

Alston wishes to show that those who engage in CP are justified in much the same way that we are justified in engaging in a different practice—perceptual practice (PP). PP is the very familiar practice of forming certain perceptual beliefs in response to perceptual experiences.

Alston argues that there is no non-circular justification available for PP; this is because our only access to the physical world, of which PP gives us knowledge, is through PP itself. The only justification we have for PP is that we do not have sufficient reason for believing that it is unreliable. CP, claims Alston, is justified by the same standard. Those who claim that we need some independent reason for trusting CP are holding it to a higher epistemic standard than PP.

Alston went on to offer a book-length defense of these ideas in Perceiving God. In Perceiving God Alston spends significant time discussing objections to what he is now calling Christian Mystical Practice (CMP). He concludes that all the objections fail and that they are guilty of one of two things: epistemic imperialism or double standards. He describes epistemic imperialism as requiring that CMP be like PP in some way, if it is to be justified, without any epistemic support for that requirement. Objections are guilty of double standards when they seek to apply a standard to CMP that PP would not meet (Alston 1991: 248-250).

b. Alvin Plantinga

Alvin Plantinga has authored and edited a number of books and essays on reformed epistemology. Plantinga’s earliest work on the topic, God and Other Minds, represents an initial attempt to undermine the evidentialist objection. In God and Other Minds, Plantinga assumes that (2) is generally correct. There isn’t, according to Plantinga, sufficiently good evidence for belief in God—at least not in the way that is demanded by the evidentialist. Plantinga’s approach at this point, then, is to argue that there is a double standard with regard to (1). So while the evidence and arguments for belief in God are far from conclusive, they are, in fact, on par with other beliefs that we take to be rational. For example, as the argument goes, we take the belief that other minds exist to be rational despite the fact that philosophical arguments in its favor suffer many of the same problems that plague traditional theistic arguments. Thus, concludes Plantinga, “if my belief in other minds is rational, so is my belief in God. But obviously the former is rational; so, therefore, is the latter” (1967: 271). This is the first of Plantinga’s so called parity arguments.

In more recent literature, however, Plantinga abandons this earlier parity argument as a way to deal with the evidentialist objector. This is due in part to the fact that in God and Other Minds Plantinga assumed, like the evidentialist objector, that the way to go about discussing the rationality of religious belief was to first consider the evidence in its favor. Here is Plantinga discussing this assumption:

I was somehow both accepting but also questioning what was then axiomatic: that belief in God, if it is to be rationally acceptable, must be such that there is good evidence for it. This evidence would be propositional evidence: evidence from other propositions you believe, and it would have to come in the form of arguments. This claim wasn’t itself argued for: it was simply asserted, or better, just assumed as self-evident or at least utterly obvious. What was taken for granted has now come to be called ‘evidentialism’ (a better title would be ‘evidentialism with respect to belief in God’, but that’s a bit unwieldy). (2000: 70)

Plantinga, then, initially attempted to confront the evidentialist objection by merely pointing out its inconsistent application. In more recent work, however, Plantinga adopts a new, bolder approach in response to the evidentialist objection. He directly confronts the evidentialist objection by arguing that it is motivated by a failed theory of justification—namely, classical foundationalism. Crucial to the argument, then, is the claim that the evidentialist objection arises from the influence of classical foundationalism. A detailed response to classical foundationalism is found in chapter 3 of Warranted Christian Belief. The idea presented there is not that (1) is applied inconsistently, but that there is no good reason to think that (1) is true.

As well as this negative approach to challenging the evidentialist objection Plantinga also seeks to offer something more positive. In his book, Warrant and Proper Function, Plantinga seeks to offer an account of warrant—his term for whatever it is that makes the difference between true belief and knowledge. In Warranted Christian Belief Plantinga applies his account of warrant to religious belief and argues that there is no way to show that religious belief is not warranted without first assuming that it is false.

c. Nicholas Wolterstorff

Nicholas Wolterstorff’s defense of some of the central claims of reformed epistemology is perhaps less significant than that of the previous two figures, but his contributions are certainly more wide-reaching. His earliest contribution is his book Reason within the Bounds of Religion. In this book Wolterstorff grapples with the question of how to be both a Christian and a scholar, and of how one’s faith ought to relate to and bear upon one’s reasoning. Though we find no explicit formulation of reformed epistemology here, it is clear that he is attempting to develop a view on which religious beliefs are neither subordinate to nor independent of our other beliefs.

His most explicit contribution to reformed epistemology comes in the collection of essays that he edited with Alvin Plantinga called Faith and Rationality. In his paper entitled “Can belief in God be rational?” he considers what obligations rationality places upon us, and in particular whether rationality requires that we only believe in God on the basis of evidence. Wolterstorff argues that:

A person is rationally justified in believing a certain proposition which he does believe unless he has adequate reason to cease from believing it. Our beliefs are rational unless we have reason for refraining; they are not nonrational unless we have reason for believing. They are innocent until proved guilty, not guilty until proved innocent. (Wolterstorff 1983: 163)

He then turns to applying this to belief in God. He observes that people come to believe that God exists in a variety of ways such as from their parents, or in response to an overwhelming sense of guilt, or by finding peace in the midst of suicidal desperation. In many cases, belief in God seems to be immediate (that is, not based upon other beliefs) and so long as the person who forms the belief has no adequate reason to give up their belief then that belief will be rational.

More recent contributions from Wolterstorff come in his books Divine Discourse and Justice. In the former he is engaged in a philosophical discussion of the claim that God speaks, and in the latter, he is defending an account of human rights. Although these books are not about reformed epistemology they are informed by it. Wolterstorff is still engaged in showing how certain religious beliefs can be rational. Furthermore, Wolterstorff is clearly putting into practice some of the key claims of reformed epistemology. In Justice it is clear that Wolterstorff is seeking to show how some religious claims interact with the discussion of human rights—in doing this, Wolterstorff treats the religious claims as standing on equal footing with the non-religious claims. What this means in practice is that he does not attempt to justify religious claims on grounds acceptable to the non-religious, but neither does he treat religious claims as immune to criticism.

4. Evidence and Rational Belief in God

According to the reformed epistemologist, objections to the rationality of belief in God often revolve around the claim that belief in God lacks the appropriate evidence. In order to see this, we can, following Plantinga, identify two distinct types of objections—namely, the de facto and de jure objections. The de facto objection is, historically, the form that many objections to religious belief have taken. That is, the religious skeptic often questions the reality or truth of the religious conviction before directly considering epistemic questions. De facto objections take many forms, with the problem of evil perhaps being the most well-known and widely discussed in the philosophical literature. As the argument goes, a benevolent and omnipotent God cannot possibly exist given the amount of unnecessary or gratuitous evil.

In contrast to the de facto objection, there is an epistemic objection—or, as Plantinga calls it, the de jure objection. The de jure objection sets aside the question of whether God exists and instead focuses on the justification and rationality of belief in God. The de jure objector asks whether belief in God is irrational, unjustifiable, or epistemically irresponsible. This objection comes in various forms as well. For some, belief in God is irrational because it is the result of some cognitive malfunction: belief in God is so irrational, it is claimed, that it could only have been invented by mad, deluded people who base their belief on insufficient justification or argument. For others, belief in God is akin to belief in Santa Claus: a belief for which there is no evidence and which no adult could justifiably hold. Whichever line the de jure objector takes, what unites these objectors is the idea that belief in God lacks the kind of epistemic justification necessary for rational belief. And for many de jure objectors there is the assumption, as Plantinga notes, that rational belief in God requires (propositional) evidence for its epistemic support. Call this the evidentialist de jure objection. What motivates it is the idea that belief in God both requires and lacks the appropriate evidence. The central claim of the evidentialist position is that one ought to believe only when one has the appropriate evidence. Thus, if theism really is akin to belief in Santa Claus (for which there is no good evidence), then belief in God looks dubious, and the shape of the evidentialist de jure objection becomes clearer: belief in God is rational only if it is supported by appropriate evidence; theism lacks such evidence; therefore, belief in God is irrational.

What makes reformed epistemology distinctive here is the response it gives to this criticism. The expected move would be to try to show that there is adequate evidence for theism. Instead, the reformed epistemologist rejects the evidentialist assumption (and on some accounts might even grant that there is insufficient inferential evidence). While there are perhaps several ways to resist the evidentialist assumption, the best-known account is offered by Plantinga, who argues that the evidentialist assumption is undermined because it is motivated by a failed theory of justification—namely, classical foundationalism.

5. Classical Foundationalism

In order to undermine the evidentialist objection, reformed epistemologists have sought to argue against what they take to be the underlying epistemological view that motivates the objection. The view that they identify as playing this role they call Classical Foundationalism.

Classical Foundationalism holds that there are two kinds of belief: basic beliefs and non-basic beliefs. The basic beliefs are rational even when not held on the basis of other beliefs, whereas non-basic beliefs are only rational when supported by basic beliefs. The reason why classical foundationalism motivates the evidentialist objection against belief in God is because of the restrictions it puts on what can reasonably be a basic belief—on what is a properly basic belief.

According to the classical foundationalist, the only beliefs that are properly basic fall into one of the following three categories:

evident to the senses,

incorrigible, or

self-evident.

This means that any belief that does not fall into one of these categories can only be rational if it is supported by beliefs that do fall into these categories. With this framework in place it seems quite easy to formulate the evidentialist objection against belief in God. This is because belief in God does not seem to be evident to the senses, incorrigible, or self-evident. Given this, then, we can claim that belief in God is only rational if it is supported by adequate evidence—that is, by other beliefs that are evident to the senses, incorrigible, or self-evident.

It is possible to find historical examples of arguments along these lines. For example, here is J. L. Mackie discussing the rationality of belief in God:

If it is agreed that the central assertions of theism are literally meaningful, it must also be admitted that they are not directly verifiable. It follows then that any rational consideration of whether they are true or not will involve arguments… it [whether God exists] must be examined by either deductive or inductive reasoning or, if that yields no decision, by arguments to the best explanation; for in such a context nothing else can have any coherent bearing on the issue. (Mackie 1982: 4, 6)

Mackie is not alone in these demands. John Locke placed similar demands on religious belief, boldly claiming that those who assent to (religious) belief without evidence “transgress against their own light” and disregard the very purpose of those faculties that are designed to evaluate the evidence necessary for belief.

The reformed epistemologist contends that this view has been the dominant one among theists and atheists alike, and so the question of whether or not belief in God is rational has focused on whether or not there is adequate evidence for that belief. It is for this reason that reformed epistemologists have seen their first task as being to show why classical foundationalism fails as an account of what it takes for a belief to be rational.

a. Rejecting Classical Foundationalism

The case for rejecting classical foundationalism rests on two key arguments. First, classical foundationalism classes as irrational a large number of beliefs that we typically take ourselves to know. Second, classical foundationalism is self-referentially incoherent.

The first problem raised against classical foundationalism is that it classes beliefs such as ‘the world has existed for more than five minutes’, ‘other persons exist’ and ‘humans can act freely’ as not properly basic. These beliefs, claims Plantinga, (along with a great many others) are accepted by the vast majority of rational humans; yet, the arguments for these beliefs are remarkably weak. Most people who believe these things can offer no arguments for their belief, and those who can, still seem to hold the belief with a greater degree of certainty than the argument would seem to warrant. Plantinga writes that the problem of other minds is to explain how it is that the very common belief that other humans have a mental life could be justified. Plantinga thinks that the best argument is the argument from analogy—that we observe that our own mental events such as being in pain are accompanied by certain behaviors, such as grasping the area where the pain is located, and then infer from this that when others are exhibiting similar behavior, they are also having the associated mental event. This inference from a single case hardly seems to justify the belief that there are other minds, but if it can be shown to be sufficient it would still be implausible to claim that only those who have knowledge of the argument are rational in their belief that other minds exist. This, perhaps, would not be so troubling if it were not the case that so many beliefs that do not meet the requirements set down by classical foundationalism are believed in a basic way by most rational humans. Anthony Kenny has pointed out that there are many beliefs that, although we can find some evidence for them, should not be thought of as being based upon that evidence because the evidence is believed with less strength than what it is evidence for. He suggests that the belief that Australia exists is just such a belief:

If any one of the ‘reasons’ for believing in Australia turned out to be false, even if all the considerations I could mention proved illusory, much less of my noetic structure would collapse than if it turned out that Australia did not exist. (Kenny 1983: 19)

The same goes for beliefs such as ‘I am awake’ or ‘human beings die’. If these beliefs can be rational only if they are based upon evidence, then classical foundationalism seems to suggest that we should hold many of our beliefs with much less certainty, and give up many other very strongly held beliefs.

Plantinga’s second objection is that classical foundationalism is self-referentially incoherent. Classical foundationalism itself is not self-evident, nor is it incorrigible, and it is certainly not evident to the senses. This means that if it is to meet its own standards there must be an argument for it from premises that are self-evident, incorrigible, or evident to the senses. No such argument presents itself, and it is certainly difficult to see where one would start, especially in light of some of the counterintuitive consequences of classical foundationalism highlighted above.

It’s worth noting here that not all reformed epistemologists think the connection between classical foundationalism and evidentialism is so obvious. There are two main lines of criticism of Plantinga’s arguments against classical foundationalism. The first is to question the link between classical foundationalism and the evidentialist objection, and the second is to claim that Plantinga has failed to show that classical foundationalism is an untenable position.

This first criticism can be found among Plantinga’s fellow reformed epistemologists:

[I]f [Plantinga] is saying that no one has explicitly presented [the evidentialist objection] as following from some other developed and articulated position that is probably true, but it remains to be shown that anyone has done that with respect to classical foundationalism either. But if the claim is that no other epistemological theory could plausibly serve as a reason for the evidentialist denial, that is palpably false. (Alston in Tomberlin and van Inwagen 1985: 296)

[Plantinga’s] discussion puts us in the position of seeing that the most common and powerful argument for evidentialism is classical foundationalism, and of seeing that classical foundationalism is unacceptable. But to deprive the evidentialist of his best defense is not yet to show that his contention is false. (Wolterstorff 1983: 142)

The criticism from Alston and Wolterstorff is that Plantinga has done nothing to persuade us that the evidentialist objection has no force; at best he has shown that no previous articulation of the objection is successful (supposing that it is correct that all previous versions of the argument rely on something very much like classical foundationalism).

The second response to Plantinga can again be found in Alston (Alston in Tomberlin and van Inwagen 1985: 296-299). Alston observes that Plantinga has not shown that the defender of classical foundationalism cannot argue for classical foundationalism from premises that are properly basic by her lights. Alston agrees that it is hard to see how this might be done but denies that this supports the conclusion that it cannot be done.

Plantinga’s critique of classical foundationalism noted above might be understood as a negative approach. The responses from Alston and Wolterstorff, then, are directed at this negative approach. Plantinga, however, also offers a different, more positive approach to the issue of proper basicality. He asks us to reconsider what might be classified as properly basic. Rather than select criteria and then categorize our beliefs accordingly, we should amass examples of beliefs that we take to be properly basic, along with the circumstances in which they are considered properly basic. After this process, Plantinga suggests, one could then propose criteria following reflection on these examples. It is important to keep in mind, though, that not all of the example beliefs will qualify as genuinely properly basic (despite any initial appearances to the contrary).

But who is to decide the set of examples, and how do we weed out bad examples without any criteria? Plantinga deliberately gives no definitive answers to these questions. According to Plantinga, it is the responsibility of each community to decide what it considers to be properly basic and to take that as a starting point; there can then be an exchange between the examples and the criteria that they are used to justify, each refining the other. The claim is not that those beliefs that are held by one’s own community to be properly basic are properly basic; rather, the claim is that this is the best starting point for enquiry. It may be that your community has got it wrong about what beliefs are properly basic, but hopefully this will be revealed by further reflection.

According to the reformed epistemologist, there is no neutral starting point for philosophical inquiry, so it is up to each community to assess its own starting point and take that as a defeasible foundation for inquiry. Communities are not free, however, to decide what beliefs are basic for them. What we believe is rarely within our own control—for example, one cannot simply decide to believe that the moon does not exist. This means that there is an objective fact about what each community does take as its starting point.

It might be objected that this is arbitrary, but Plantinga contends that there is no set of beliefs that will be entirely uncontroversial, and no criterion of proper basicality that is more convincing than the beliefs that most people take as properly basic. Or perhaps some will grant that this method is correct but still find it implausible that belief in God should be properly basic. In the case of perceptual beliefs the ground for them is obvious, even if how they are grounded is not clear. God, if he exists, is surely much more remote, and his existence is not the sort of thing that can be known in the basic way.

Plantinga responds by pointing out that, within the Reformed tradition at least, belief in God is considered to be grounded. According to John Calvin, one of the important figures in the Reformation, humans each have a natural tendency to believe that God exists when placed in certain circumstances; in fact, he claims that God “daily discloses himself in the whole workmanship of the universe” (Plantinga 2000: 66). Plantinga does not argue for the truth of such a position; rather, he mentions it to show that his claim that belief in God can be properly basic is not ad hoc, but is in fact implicitly the view held by a large number of people, and by the Reformed tradition more specifically. It is not necessary that Plantinga know, or even have good reason to believe, the claims made by Calvin and others; as long as there are in fact experiences that serve to ground belief in God, that belief will be properly basic on those occasions. It is due to this appeal to reformed thinkers that the view has come to be known as reformed epistemology.

On the surface, reformed epistemology bears some similarity to fideism. Fideism is the claim that belief in God is not rational, but must be accepted upon faith; it is usually claimed that this belief is independent of reason, or in more extreme cases that it is opposed to reason. The reformed epistemologist will agree with the fideist that arguments are not needed to justify belief in God, but what about the relationship between reason and belief in God?

It is clear from what has already been discussed that the reformed epistemologist will not subscribe to the more extreme fideism, because to believe what is properly basic is not to believe what is opposed to reason. What is, at first, less clear is whether to believe in God in the basic way is to believe independently of reason. Plantinga considers a distinction between reason and faith suggested by Abraham Kuyper (Plantinga 1983: 88): the deliverances of reason are those beliefs that are based on argumentation and inference, whereas the deliverances of faith are beliefs that are held independently of argument and inference. On this understanding of faith, anything held in the basic way will be taken on faith. For example, this definition would suggest that the beliefs that 2+1=3, that external objects exist, and that I am awake are all held on faith. This is not the understanding of faith that the fideist has in mind, since it does not serve to draw a distinction between faith and reason. Plantinga explains that there is no reason for the reformed epistemologist to think that belief in God is independent of, or opposed to, reason:

Belief in the existence of God is in the same boat as belief in other minds, the past, and perceptual objects; in each case God has so constructed us that in the right circumstances we form the belief in question. But then the belief that there is such a person as God is as much among the deliverances of reason as other beliefs. (Plantinga 1983: 90)

Reformed epistemologists, unlike fideists, hold that religious belief is rational, but unlike the evidentialist, they deny that this rationality is due to the beliefs being based upon evidence.

6. The Positive Case in Reformed Epistemology

So far, much of what has been said here has been focused on undermining a certain sort of objection to the rationality of religious belief. The second significant strand of reformed epistemology concerns providing a description of the way in which religious beliefs can be rational.

a. The Christian Mystical Practice

In Perceiving God William Alston seeks to describe and defend what he calls the Christian Mystical Practice (CMP). This is the practice of forming beliefs about God in response to certain kinds of experiences.

Alston first argues that there is no non-question-begging way to show that any basic belief-forming practice is reliable—one will always have to appeal to the practice itself. In light of this we cannot require that belief-forming practices enjoy independent support before we engage in them, because this support will never be available. It may be that some practices can be ruled out due to being inconsistent, but no adequate reason can be found for thinking that any of our basic belief-forming practices are reliable.

Instead Alston argues that it is reasonable to accept socially established practices; those practices that have demonstrated stability over a number of generations and which are deeply embedded in our psyche. Such practices provide prima facie justification for the beliefs that they produce. Furthermore, if these practices are not shown to be unreliable then the beliefs that result from them are rational.

Alston claims that CMP is one of these practices. Christians have been forming beliefs in this way for centuries, and the practice is deeply embedded in the culture. This means that engaging in the practice is prima facie justified. And as long as there are no adequate reasons for thinking that CMP is unreliable then the beliefs that result from this practice will be justified.

Alston goes on to argue that many of the reasons for thinking that CMP is unreliable exhibit one or both of two flaws: imperialism and double standards. An objection such as the claim that CMP must be unreliable because most normal adults do not practice it is, Alston argues, guilty of imperialism. It imposes a standard on CMP that requires it to be more like the Sense-perceptual Practice (SP) for no good epistemic reason. Why should we expect the only reliable practices to be those used by the whole population? An example of an objection that imposes a double standard would be requiring that the outputs of CMP be independently verifiable. Alston argues that no basic belief-forming practice meets this requirement, including SP, so requiring something like this of CMP is to apply a standard that one would not apply across the board.

b. The Parity Argument

The beginnings of the parity argument can be seen in Plantinga’s early writings as far back as God and Other Minds. There, Plantinga argues that belief in other minds and belief in God are in the same epistemological predicament; all of the arguments in their favor fall short when subjected to philosophical scrutiny. Yet, as Plantinga states, “if belief in other minds is rational, so is my belief in God. But obviously the former is rational; so, therefore, is the latter.” As Plantinga’s thinking has developed, so has his parity argument as it relates to rational belief in God. The key difference in his thinking, as he notes in Warranted Christian Belief, is that he no longer takes proofs to be the only way to justify belief in God. This major shift in Plantinga’s thinking opens the door for a more daring parity argument, namely that in the same way that perceptual beliefs are justified, belief in God—arising through the sense of divinity, the sensus divinitatis—is also justified and should thus enjoy the same epistemic status as ordinary perceptual beliefs.

Plantinga’s parity argument for rational belief in God follows a specific pattern. The first goal is to highlight those beliefs that we take to be both rational and basic. In other words, it needs to be the kind of belief that is rational despite not being inferred from any evidence or argument. Further, it must be the sort of belief that if held hostage to evidential demands it would have devastating epistemological results; perceptual beliefs, it is thought, are specifically what Plantinga is looking for. Consider for example the belief that I see a clock hanging on the wall. It would be difficult to present any non-circular or non-question begging evidence to justify my belief. Yet, this is what the evidentialist demands. So if we can disregard the demands of the evidentialist in the case of perceptual beliefs, then perhaps the demands the evidentialist places on belief in God should be reconsidered as well; neither can produce the required (non-question begging) evidence, but surely in the case of our perceptual beliefs it can’t be said that we as agents are unjustified, epistemically irresponsible, or irrational in our belief. This of course raises further questions about evidential demands. This, then, is the first parallel that Plantinga and other reformed epistemologists make. The second parallel deals with the similarities between perceptual and religious experiences.

Perceptual beliefs arise from some perceptual experience; the belief arises suddenly with the cognizer having no control over the initial belief. The perceptual belief that arises from the experience is prima facie justified. Thomas Reid, whose influence on reformed epistemology is of note, argued that what we perceive is not “only irresistible, but it is immediate; that is, it is not by train of reasoning and argumentation that we come to be convinced of the existence of what we perceive.” Perceptual beliefs, according to Reid, are not inferred but immediately known by the perceiver. The parallels between perceptual beliefs and belief in God, on Plantinga’s account anyway, are important. The idea is that belief in God and perceptual beliefs are both immediate and the result of our cognitive faculties. Thus, if some perceptual belief like “I see a tree” is prima facie justified, then belief in God, if it arises in the same manner (for example, the result of some cognitive faculty), is also prima facie justified.                 

So what is this special faculty that gives rise to belief in God in an immediate, non-inferential fashion? Plantinga uses a term well known to most in the reformed tradition: the sensus divinitatis. Calvin, whom Plantinga credits with the notion, claimed that one can accept and know that God exists without any argument or evidence. As a result of the workings of the sensus divinitatis, belief in God is properly basic and is not inferred from any evidence or argument. Plantinga’s position is summed up nicely here:

Calvin’s claim, then, is that God has created us in such a way that we have a strong tendency or inclination toward belief in him. This tendency has been in part overlaid or suppressed by sin. Were it not for the existence of sin in the world, human beings would believe in God to the same degree and with the same natural spontaneity that we believe in the existence of other persons, an external world, or the past. This is the natural human condition; it is because of our presently unnatural sinful condition that many find belief in God difficult or absurd. The fact is, Calvin thinks, one who does not believe in God is in an epistemically substandard position—rather like a man who does not believe that his wife exists, or thinks she is likely a cleverly constructed robot and has no thoughts, feelings, or consciousness. Although this belief in God is partially suppressed, it is nonetheless universally present. (Plantinga 1983: 66)

From this, Plantinga concludes that “there is a kind of faculty or cognitive mechanism, what Calvin calls sensus divinitatis or a sense of divinity, which in a wide variety of circumstances produces in us beliefs about God.” So in the same way that perceptual beliefs such as “I see a table” are non-inferential and properly basic, belief in God, when occasioned by the appropriate circumstances (such as one feeling a sense of guilt, dependence, beauty, and so forth), can also be properly basic because of the cognitive working of the sensus divinitatis.

On Plantinga’s reformed account then, belief in God can now be added to the list of properly basic beliefs:

  1. I see a tree (known perceptually),
  2. I am in pain (known introspectively),
  3. I had breakfast this morning (known through memory), and
  4. God exists (known through the sensus divinitatis).

This belief can be taken as properly basic if the agent’s belief has sufficient warrant.

There is another important question to be asked, however. Does it follow from this that belief in God is groundless? If I come to believe in God on the reformed model, can it be said that my belief is groundless? Plantinga argues that in the same way that “I see a tree” is properly basic but not groundless, belief in God is not groundless. Understanding what Plantinga means by “groundless” is important for seeing the distinction between evidence and grounds for belief. Perceptual beliefs, such as those produced by visual experiences, are not considered groundless, because they are grounded in the senses. Likewise, Plantinga claims that belief in God is not groundless, because it is rooted in the experiences that occasion the workings of the sensus divinitatis. These experiences, however, do not entail that the belief in question is inferential. The belief is merely occasioned by the circumstance (for example, the circumstance of beholding some majestic mountains or a desert sunset) which triggers the working of the sensus divinitatis. Those who believe in God simply find themselves with this belief.

Another important point concerns defeaters for belief in God. Plantinga argues that while belief in God is properly basic, it is also open to defeat. Suppose that someone offers a defeater for the belief that God exists; then, claims Plantinga, that particular belief would have to be abandoned. It is possible, however, for one to offer a defeater-defeater, in which case the belief may justifiably be maintained. This is an important point, for we can now see that a properly basic belief, for Plantinga, is not some incorrigible or indubitable belief that one can always hold despite defeating evidence. It is, in other words, properly basic but open to defeat.

c. Warranted Christian Belief

Alvin Plantinga has developed an important account of how religious belief could amount to knowledge. This view is discussed in his trilogy: Warrant: The Current Debate, Warrant and Proper Function, and finally, Warranted Christian Belief. In this Warrant trilogy, Plantinga is interested in the question “What is knowledge?”, and more specifically in what it is that makes the difference between mere true belief and knowledge. He calls this, whatever it is, warrant.

Warrant is just one of a number of epistemic terms that are used in epistemology; others include justification, rationality and evidence. Warrant is of particular importance, however, because if we can answer the question “What is warrant?” then we will have an answer to the question “What is knowledge?”

Plantinga argues that warrant results from the proper functioning of your cognitive faculties:

[A] belief has warrant for me only if (1) it has been produced in me by cognitive faculties that are working properly (functioning as they ought to, subject to no cognitive dysfunction) in a cognitive environment that is appropriate for my kinds of cognitive faculties, (2) the segment of the design plan governing the production of that belief is aimed at the production of true beliefs, and (3) there is a high statistical probability that a belief produced under those conditions will be true. (Plantinga 1993: 46-47)

Key to Plantinga’s analysis of warrant is that a belief can only be warranted if it is produced by a cognitive faculty that is functioning properly, which means that it must not be diseased or broken or hindered. In order to make sense of what it means for our cognitive faculties to be functioning properly we must introduce the notion of a design plan, which determines the way our cognitive faculties are supposed to work. Just as the human heart is supposed to beat at 50-80 beats per minute while at rest, so too there is a way that our cognitive faculties are supposed to function. This, claims Plantinga, need not invoke the notion of conscious design (by God, or anyone else); rather, he means to invoke the common idea, shared by many theists and non-theists, that parts of our bodies have functions, such as one of the functions of our legs being to allow us to move through our environment.

As well as functioning properly, our cognitive faculties must also be operating in the right cognitive environment—the one for which they are designed. This means that one might have warrant for a perceptual belief formed about a nearby medium-sized object on a clear day, but not for a perceptual belief about a far-away object in a badly lit, smoke-filled room. It must also be the case that the part of the design plan governing the production of the belief in question is aimed at truth. Our faculties are designed for a number of different purposes, not just the production of true beliefs, which means that there may be times when our cognitive faculties are functioning properly in the correct environment and yet produce a false belief, or a belief that is only accidentally true. For example, it may be that a person who discovers that they have a life-threatening illness is designed in such a way that they will come to believe that they will recover, even if this is unlikely to be true—perhaps because one is more likely to recover if one believes that one will. That would be a case of cognitive faculties functioning properly in the correct environment, but not a case of the belief being warranted, because the design plan, in this instance, does not aim at truth.

The final requirement is that there is a high statistical probability that a belief produced by the cognitive faculty in question will be true when the faculty is functioning properly in the environment for which it was designed—which is to say that the design must be a good one. Plantinga imagines a situation in which our faculties have been designed by some lesser deity who has done such a poor job that, even when our faculties are functioning properly, in the correct environment, according to a design plan that is aimed at truth, we still form mostly false beliefs because the design is so poor. If this were the case then our beliefs would not have warrant, even in cases where they turned out to be true. For this reason a reliability condition is required as well.
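
Read schematically, warrant on this account is simply the conjunction of the four conditions just described. The following is only an illustrative sketch in Lean notation, with hypothetical predicate names standing in for the informal conditions; it is not Plantinga’s own formalization.

    -- Illustrative sketch only; the predicate names are placeholders, not Plantinga's terms.
    structure WarrantConditions (Belief : Type) where
      properFunction   : Belief → Prop  -- produced by properly functioning faculties
      rightEnvironment : Belief → Prop  -- formed in the environment the faculties were designed for
      truthAimedPlan   : Belief → Prop  -- the relevant segment of the design plan aims at truth
      reliableDesign   : Belief → Prop  -- beliefs so produced are objectively likely to be true

    -- On this sketch, a belief is warranted only if all four conditions hold of it.
    def warranted {Belief : Type} (W : WarrantConditions Belief) (b : Belief) : Prop :=
      W.properFunction b ∧ W.rightEnvironment b ∧ W.truthAimedPlan b ∧ W.reliableDesign b

Nothing in the sketch says how one could check whether the conditions hold; it only records that warrant, on this account, requires all of them together.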

One important point to note is that Plantinga’s account is an externalist one. This means that, on Plantinga’s view, warrant involves not just facts that the agent is aware of but also facts that the agent may not be aware of, such as whether one’s faculties are functioning properly and facts about the environment. This point is crucial to Plantinga’s account, given that whether or not a theist has warrant for her religious beliefs may depend on facts of which she is unaware.

Plantinga claims that given this view in epistemology there is no good reason to think that religious belief is not warranted. Plantinga claims that, following John Calvin, we may have been created by God with a faculty called the sensus divinitatis. Any beliefs that result from this faculty will be in a position to be warranted. So long as the faculty was designed by God for the purpose of producing true beliefs about him then this faculty will meet the requirements described above and the resulting beliefs will be warranted.

It is not Plantinga’s intention to show that this faculty exists or that this really is the way that religious beliefs come about. Instead, his claim is that, since for all we know this is true, one cannot reasonably claim that religious beliefs are not rational without first showing that this account is false.

7. Objections to Reformed Epistemology

Reformed epistemology has received a significant amount of attention and attracted many objections. Some of the most significant ones are described below.

a. Great Pumpkin Objection

There is a family of objections known as Great Pumpkin objections. These objections get their name from the Peanuts comic strip, in which the character Linus is a child who believes that each Halloween the Great Pumpkin will come to visit him at the pumpkin patch. What these objections have in common is that they claim that, if reformed epistemology is correct, then belief in God is no more rational than belief in the Great Pumpkin.

This kind of objection is first mentioned by Plantinga in “Reason and Belief in God” (74-78). One of the claims of reformed epistemology is that the religious believer need not offer any criteria for deciding which beliefs are reasonable starting points for forming further beliefs. Instead, each community is responsible for determining its own starting points and reasoning on that basis. Plantinga supposes that someone might object to this by claiming that this method means that the community in question will have no reason to accept any belief over any other. This community could take belief in God to be properly basic, but it might instead take as properly basic the belief that the earth is flat, or that one can run at the speed of light if one tries really hard, or the belief that the Great Pumpkin will return at Halloween to the most deserving pumpkin patches. There is no reason, so the objection goes, to choose one belief over another without first offering some criteria for determining which beliefs are rational starting points and which are not.

Plantinga points out that in other areas we are able to discriminate between two things even if we are not able to give criteria for how that discrimination is to be done. The example he gives is the meaningfulness of sentences. Plantinga observes that we can easily tell that the sentence “T’was brillig; and the slithy toves did gyre and gymble in the wabe” is meaningless even if we cannot appeal to some general criteria of meaning. Likewise, claims Plantinga, there is no reason to think that something similar will not be possible for beliefs. This example shows that there is nothing mysterious about the suggestion that we might be able to tell which candidates belong to a certain class, and which do not, without also being able to state criteria for inclusion. For these reasons this objection need not trouble the reformed epistemologist.

Michael Martin offers a more troubling version of the argument. He does not label his objection as a Great Pumpkin objection, but Plantinga refers to it as the Son of the Great Pumpkin objection. Here is how Martin phrases the objection:

Although reformed epistemologists would not have to accept voodoo beliefs as rational, voodoo followers would be able to claim that insofar as they are basic in the voodoo community they are rational and, moreover, that reformed thought was irrational in this community. Indeed, Plantinga’s proposal would generate many different communities that could legitimately claim that their basic beliefs are rational. (Martin 1990: 272)

This second objection concerns whether or not a community can make judgments about the basic beliefs of other communities in a principled way. Its members may be able to argue that the believers in some other community are not justified in holding some of their non-basic beliefs, because those beliefs are not adequately supported by their basic beliefs; but since the basic beliefs are not supported by other beliefs, there seems to be no way for those outside the community to criticize them. If this is correct, it is a very strange and counter-intuitive result. There are various beliefs that we think are objectionable even if they are held in the basic way; for example, the belief that the Great Pumpkin will return every Halloween, the belief that the Earth is flat, and the claims of astrology all seem to be objectionable from the epistemic point of view, whether or not they are held in the basic way.

The reformed epistemologist regards the process of assembling examples of properly basic beliefs to be the responsibility of each community, and so, it would seem, at least at first, that she is committed to a sort of epistemic relativism whereby the most one can do to criticize the beliefs of a person from a different community is to point out internal inconsistencies. This wouldn’t necessarily be a major problem, except for the fact that the sorts of communities that seem to be included are ones that hold bizarre, irrational or superstitious beliefs—beliefs like astrology, voodoo or perhaps even the Great Pumpkin belief.

The reformed epistemologist can respond to this objection by pointing out that one could challenge the basic beliefs of another community by finding a defeater. Our basic beliefs are defeasible, and therefore open to revision in light of further information. This means that just because you are permitted to treat a belief as properly basic if it seems to you that it is, it does not follow that you will continue to be permitted to hold that belief no matter what. You may gain a defeater for that belief and no longer be able to hold it reasonably. A person may be justified in taking a belief such as the Great Pumpkin belief as basic if she has been raised to believe that the Great Pumpkin exists, but when she comes to learn more about the world—for example, when, yet again, the Great Pumpkin fails to arrive on Halloween—she will obtain a defeater for that belief, and it will no longer be reasonable for her to hold it.

The reformed epistemologist is therefore not endorsing an epistemic free-for-all, since just because a belief is basic does not mean that it is immune to epistemic appraisal. It is still perfectly possible for anyone to argue against the basic beliefs of another community, and to show them that one of their beliefs is false or unjustified.

The third, and final, version of this objection claims that reformed epistemology places belief in God beyond epistemic appraisal and that its methods could be adapted to place other beliefs beyond epistemic appraisal—beliefs that are clearly irrational like belief in the Great Pumpkin. If the methods of reformed epistemology can be used to defend beliefs like these then it cannot be successful in establishing the rationality of religious belief.

Linda Zagzebski has offered an objection like this one. She claims that reformed epistemology has failed to meet the requirements of what she calls the “Rational Recognition Principle (RRP): If a belief is rational, its rationality is recognizable, in principle, by rational persons in other cultures” (Zagzebski in Plantinga et al. 2002: 120). Zagzebski directs her objection against Plantinga and writes that reformed epistemology

violates the Rational Recognition principle. It does not permit a rational observer outside the community of believers in the model to distinguish between Plantinga’s model and the beliefs of any group, no matter how irrational and bizarre—sun-worshippers, cult followers, devotées of the Greek gods . . . , assuming, of course, that they are clever enough to build their own epistemic doctrines into their models in a parallel fashion. But we do think that there are differences in the rationality of the beliefs of a cult and Christian beliefs, even if the cult is able to produce an exactly parallel argument for a conditional proposition to the effect that the beliefs of the cult are rational if true. Hence, the rationality of such beliefs must depend upon something other than their truth. (Zagzebski in Plantinga et al. 2002: 122)

A similar objection is offered by Keith DeRose in his unpublished essay “Voodoo Epistemology.” DeRose argues that the real worry for reformed epistemology is that it could be adapted to defend some very strange and clearly irrational beliefs. This, claims DeRose, shows that there is something wrong with reformed epistemology even if we cannot say exactly what it is.

This objection is not completely devastating for reformed epistemology but it does make the achievements of reformed epistemology look much less significant. Work in this area by Kyle Scott (2014) has suggested that we ought to consider the historical and social environments that beliefs occur in, arguing that only beliefs that occur in stable and enduring communities are viable candidates for being defended in the way that reformed epistemologists defend religious belief.

b. Disanalogies

An important claim made by reformed epistemology is that religious belief can be rationally held in the basic way, similar to perceptual beliefs. An objection to this is that it cannot be reasonable to hold religious beliefs in the basic way because of significant differences between perceptual beliefs and religious beliefs. The objection has been most forcefully put by Richard Grigg (1983). He does not think that theistic beliefs will turn out to be basic because of the disanalogies between theistic beliefs and more widely recognized basic beliefs.

Grigg interprets reformed epistemology as arguing that the Christian community is within its epistemic rights in holding that certain theistic beliefs are basic because these beliefs are analogous to other beliefs that are more widely regarded to be basic. Examples of these include: (1) I see a tree, (2) I had breakfast this morning, and (3) That person is angry. Grigg identifies three important disanalogies between these beliefs and theistic beliefs.

Firstly, Grigg points out that although beliefs such as (1)-(3) will often be basic, they are still constantly being confirmed:

For example, when I return home this evening, I will see some dirty dishes sitting in my sink, one less egg in my refrigerator than was there yesterday, etc. This is not to say that (2) is believed because of evidence. Rather, it is a basic belief grounded immediately by memory. But one of the reasons that I take such memory beliefs as properly basic is that my memory is almost always subsequently confirmed by empirical evidence. (Grigg 1983: 126)

This, on the other hand, is not true of theistic belief. Beliefs, such as that God created the world, Grigg suggests, are not confirmed by observation, and may even be disconfirmed if the problem of evil is a successful argument.

The second disanalogy is that there is a certain universality enjoyed by beliefs such as (1)-(3), but not by theistic beliefs. That is, when a person has a perceptual experience such as being appeared to treely, they will naturally believe something like “I see a tree”; and this is the case, claims Grigg, for the vast majority of people. The situation is not the same for theistic beliefs; take, for example, Plantinga’s suggestion that one might have an experience of being awed by the beauty of the universe and form the belief that God created the universe. Grigg claims that many people have this experience yet there is no universally shared belief that typically comes with this experience, unlike in the case of perceptual beliefs.

The third, and final, disanalogy that Grigg raises is that people have a bias towards theistic beliefs, but not usually with less controversial examples of properly basic beliefs. Grigg points out that there is a psychological benefit to be gained from believing that God exists, whereas, there will not usually be any obvious benefit for beliefs like (1)-(3).

Each of these disanalogies can be challenged. Mark Macleod points out that it is not obvious that these are genuine disanalogies. For example, religious beliefs may receive confirmation from multiple sources, such as sacred writings, the testimony of other believers, and further religious experiences. Although these sources are not independent of each other, it is not clear that the confirming evidence in the breakfast example above is independent either, since all the supporting evidence relies on perceptual experience at some point.

The second disanalogy is also problematic, because when a person has an experience of seeing a tree they may form a wide variety of beliefs, such as “I see a tree,” “that tree is about to fall over,” or “it is very windy today.” Contrary to what Grigg argues, the beliefs that are formed in response to perceptual experiences are not uniform.

The third disanalogy is also not clearly genuine. I may derive psychological benefit from many of my perceptual beliefs, such as the belief that the computer screen is showing a positive number next to my bank account.

Even if a case for disanalogies between perceptual experience and religious experience can be made, this may not be a problem for reformed epistemology. Reformed epistemology should not be understood as relying on the claim that religious experience is just like perceptual experience. Rather, what reformed epistemologists have been arguing is that we ought to judge religious experience by the same standards as we judge perceptual experiences, and that religious experience stands up well when judged by those standards. Given the difference in subject matter and the alleged faculties involved, it should not be surprising to find disanalogies between religious experience and perceptual experience. To develop any disanalogies into an objection to reformed epistemology it must also be shown that they are sufficient to show that such beliefs are not rational unless supported by further evidence.

c. Religious Diversity

According to reformed epistemology, religious belief can be rational even if it is not supported by evidence. What reformed epistemologists do not claim is that these beliefs will be immune to defeat. It may be that a person’s religious beliefs are initially rational, but that when the person discovers some new piece of information they cease to be. Some have suggested that, even if reformed epistemology is correct, there is a defeater for religious belief that ought to be apparent to most competent adults in the world today. This defeater comes from considering the facts of religious diversity. In this section we will consider two attempts to advance this sort of objection.

i. Religious Belief is Epistemically Arbitrary

Suppose, for the sake of argument at least, that all of the major religions might be equally well supported by arguments and that their adherents might all have the same sort of internally available markers for their beliefs. The scenario would be one where whatever the theist can offer in support of her beliefs, those who disagree can offer the same considerations. For example, suppose that Anne believes p and Bill believes ¬p, and that whatever evidence or arguments Anne can offer in support of p Bill can offer equally good evidence and arguments in support of ¬p. Suppose further that their beliefs are alike in all other respects, so that if Anne finds p intuitive, Bill finds ¬p intuitive; or if Anne takes p as foundational, Bill takes ¬p as foundational; and so on for any other considerations that might be epistemically relevant. John Hick claims that if this is the case then it is intellectually arbitrary for the religious believer to hold that her own beliefs are true while those of other religions are false, because she has no reason to treat the beliefs differently.

Richard Feldman also offers a similar objection by arguing for the following principle:

If (i) S has some good reasons (‘internal markers’) to believe P, but (ii) also knows that other people have equally good reasons (‘internal markers’) for believing things incompatible with P, and (iii) S has no reason to discount their reasons and favor her own, then S is not justified in believing P. (Feldman 2003: 88)

This principle states that even if you have good reasons for believing p, if you know that others have equally good reasons for believing something incompatible with p, and you have no reason to discount their reasons, then you are not justified in accepting p. This is because, claims Feldman, learning that others have equally good reasons for their incompatible beliefs undercuts your justification for p.

Alvin Plantinga has responded to this objection by trying to show that there is nothing inconsistent about holding onto your beliefs in the face of disagreement—even in the circumstances described above.

His first point is that the internal support that a belief enjoys does not exhaust everything that can be said about the epistemic status of a belief. Two beliefs can have all the same “internal markers” and yet still not be equal from the epistemic point of view. Other relevant features include whether or not the faculty that produced the belief is functioning properly, and whether or not the belief was produced in an environment for which the faculty was designed. Furthermore, one does not need to endorse Plantinga’s epistemology in order to agree with this point. Others have suggested that external factors are relevant to the epistemic standing of a belief, such as the reliability of the source of the belief, whether the belief is safe, or whether the belief is sensitive. What this means is that there is no inconsistency in thinking that two incompatible beliefs are alike in purely internal support and yet treating them differently. This is a very modest claim: it supplies no reason to think that judging two such beliefs differently in the sorts of cases described is justified, only that it is not contradictory to do so. This point is supposed to lay the basis for his following two points.

The second point is that if disagreement is a defeater then it would defeat too many beliefs. Plantinga labels it a “philosophical tar baby,” claiming that it would be a problem not just for him, but for his objectors as well. This is because whatever position one adopts in this debate there will be others who disagree. The Christian will believe certain claims knowing that others in similar epistemic situations disagree, as will the Hindu or the Muslim. An atheist or a pluralist will be in no better a situation, since she will think that the claims of these religions are false and know that there are others who disagree. Plantinga does not think that withholding belief avoids the problem either, since if one withholds belief there will still be disagreement concerning whether or not withholding belief is the correct epistemic attitude to adopt. This worry also extends to other areas, such as politics and philosophy, where there is also widespread disagreement. What this is supposed to show is that claiming that disagreement is a defeater has potentially disastrous consequences, leading to a sort of skepticism. This, of course, does not show that it is wrong that disagreement defeats belief; it is only meant to show that this is a problem for everyone, and not one that is solely a problem for the religious believer.

Plantinga’s third point is offered by way of a thought experiment:

Perhaps you have always believed it deeply wrong for a counselor to use his position of trust to seduce a client. Perhaps you discover that others disagree; they think it more like a minor peccadillo, like running a red light when there’s no traffic; and you realize that possibly these people have the same internal markers for their beliefs that you have for yours. You think the matter over more fully, imaginatively recreate and rehearse such situations, become more aware of just what is involved in such a situation (the breach of trust, the breaking of implied promises, the injustice and unfairness, the nasty irony of the situation in which someone comes to a counsellor seeking help but receives only hurt) and come to believe even more firmly the belief that such an action is wrong… (Plantinga 2012: 653)

Plantinga claims that in moral cases, such as this one, it is clear that it is reasonable to continue believing in the face of disagreement even when you believe that those who disagree enjoy the same internal markers as yourself. If it is reasonable in this case to continue to hold on to your beliefs then it cannot be true in general that one is required to give up beliefs in the face of disagreement.

Plantinga thinks that these three considerations are sufficient to defuse the charge of arbitrariness. His claim is that if we endorse something like Feldman’s principle above then we will be forced to give up many of our beliefs (possibly including beliefs about the principle itself), and, in particular, that this does not fit with our intuitions about what it is rational to do in the case of moral disagreements like the one Plantinga describes above.

These responses do something to help neutralize the arbitrariness charge, but they do not adequately deal with it. What Plantinga has achieved is to show that we cannot always be rationally required to give up our beliefs in the face of disagreement. But that is not sufficient to respond to the problem, because there are examples where it does seem arbitrary to hold on to your belief. An example often discussed in the literature is the restaurant case.

Suppose that Anne and Bill are in a restaurant with friends. The time comes to pay the bill and they both decide to figure out how much everyone owes. Anne believes that everyone owes $23, but Bill believes everyone owes $24. Each considers the other to be just as good at mental arithmetic and they have no reason to suspect that one of them is impaired on this occasion. In this example it seems clear that it would be irrational for Anne to hold on to her belief that everyone owes $23 even if it turns out that she is correct. She seems to have no good reason to prefer her own belief other than that it is her own.

What this suggests is that it cannot be either that disagreement always requires us to revise our beliefs or that it never requires us to revise our beliefs. What is needed is a more sophisticated epistemology of disagreement that lies somewhere between these two extremes. But Plantinga has given us no reason to think that religious beliefs will remain rational in the face of disagreement under this more reasonable epistemology of disagreement. What is needed here is a better understanding of the epistemic implications of disagreement and how that relates to religious disagreement. Fortunately, there is an active debate on this topic and it is likely that one’s opinion on that debate will determine whether or not one believes that this is a successful objection.

ii. Competing Belief Forming Practices

One of the central claims of reformed epistemology is that what determines whether religious belief is rational is not the evidence that a believer can present, but facts about the faculty that produced the belief. The facts of religious diversity offer a way to mount an argument that concludes that we have good reason to think that the faculty that produces religious belief is unreliable.

Before looking at a serious version of this argument it will be instructive to look at a naïve version and why it fails. The naïve version observes the wide variety of religious beliefs in the world and notes that many of them contradict each other. Since, given this diversity, most of these beliefs must be false even if some are correct, it seems clear that religious belief-forming methods are unreliable. This objection is not too troubling, however, since it assumes that there is a single religious belief-forming practice. That assumption is implausible. There are significant differences in the practices of different religious practitioners, so the diversity of belief is not evidence that all religious belief-forming practices are unreliable.

This objection can be developed further by observing that when it comes to religious matters there are competing methods. These competing methods frequently produce contradictory beliefs. At most, one of these methods can be reliable, but if we have no independent (that is, independent of religious belief forming methods) reason to prefer one over the others then we ought to refrain from engaging in any of them.

William Alston raises this objection against his own view. He compares it to the following situation:

Consider ways of predicting the weather: various ‘scientific’ meteorological approaches, going by the state of rheumatism in one’s joints, and observing groundhogs. Again, if one employs one of these methods but has no non-question-begging reason for supposing that method to be more reliable than the others, then one has no sufficient rational basis for reposing confidence in its outputs. (Alston 1991: 271)

It seems clear, when it comes to choosing between methods for predicting the weather, that if we have several competing methods we ought not accept any of them until we find some reason to prefer one over the other.

Alston responds to this objection by pointing out that there is an important difference between the religious case and the weather prediction case. When it comes to predicting the weather we know what sort of evidence we would need to choose between these methods—we can observe which one is getting it right. Things are different for the religious case because we do not know what reasons we could have for choosing one of these methods over another. The methods in question in the religious case are our only access to the topic—independently of these methods it is difficult to see what reasons we could have for preferring one over another. In light of this, Alston suggests that one cannot be faulted for lacking reasons to prefer one’s own religious belief forming methods.

d. Sensible Evidentialism

One of the central claims of reformed epistemology is that evidentialism with respect to belief in God is misguided. Stephen Wykstra argues that reformed epistemologists (or basicalists, as he calls them) have poorly framed the debate between themselves and evidentialists. He has sought to relocate the debate about the proper basicality of belief in God by contrasting reformed epistemology not with what he calls Extravagant Evidentialism (EE) but with Sensible Evidentialism (SE).

EE is the claim that a person’s belief is rational only if it is either basic or that person can present propositional evidence for it. If we use this to define basic and non-basic beliefs, then beliefs that arise from testimony or memory will often be basic. Since these beliefs are basic, and belief in God often derives from memory or testimony, in most cases the EE objection to belief in God will not amount to much.

Wykstra, however, claims that EE is not the best way to understand the notion of needing evidence. He highlights this by using the example of belief in electrons. Most adults believe in electrons, but very few hold this belief on the basis of evidence. Most of us believe in electrons because we have been told that they exist by scientists, or teachers or some other knowledgeable person. According to the reformed epistemologist this belief will often be basic, and so it will be immune to the evidentialist objection. This is only true if we understand evidentialism as a demand that evidence be produced for each belief by the believer. This fails to take into account that, although the believer in electrons need not be able to produce evidence, the belief is still in some sense in need of evidence. Wykstra asks us to consider the following possible situation:

Suppose we were to discover that no evidential case is available for electrons—say, that the entire presumed case for electrons was a fraud propagated by clever con-men in Copenhagen in the 1920s. Would we, in this event, shrug our shoulders and continue unvexedly believing in electrons? Hardly. We would instead regard our electron belief as being in jeopardy, in epistemic hot water, in (let us put it) big doxastic trouble. (Wykstra 1989: 485)

The electron belief may not need evidence to be rational in an individualistic sense, but evidence must be available somewhere in the community. The testimony is defective if it does not connect you to a person, or persons, who do have evidence for the existence of electrons. This is what Wykstra refers to as a much more sensible way of construing the notion of needing evidence. EE requires that evidence is possessed by the individual, whereas SE requires that the evidence is possessed by the believer’s community.

SE gives us a much more plausible evidentialist objection to belief in God. The sensible evidentialist constraint will be that belief in God is only epistemically adequate if the religious community has sufficient evidence for the belief that God exists. The “interesting basicalist” will then be someone who claims that belief in God is not in need of evidence even in this sense; that belief in God is based upon our native faculties. Wykstra observes that even if belief in God is derived from some God-given faculty it may still be the case that belief in God is in need of evidence. Belief in electrons is in need of evidence because our native faculties do not give us access to them, but beliefs based upon our native faculties, such as testimony, are also sometimes in need of evidence in a rather different way. Wykstra draws attention to some of the insights of Thomas Reid concerning testimony:

When brought to maturity by proper culture … [reason] learns to suspect testimony in some case, and to disbelieve it in others … But still, to the end of life, she finds a necessity of borrowing light from testimony … And as, in many instances, Reason even in her maturity, borrows aid from testimony, so in others she mutually gives aid to it, and strengthens its authority. For, as we find good reason to reject testimony in some cases, so in others we find good reason to rely upon it with perfect security… (Wykstra 1989: 489)

According to Reid, we each have a natural tendency to believe testimony; however, over time we learn that not all testimony is reliable, and we learn to find reasons to give some testimony greater weight and other testimony much less. Although inference plays a role in forming testimonial belief, it is still testimony that gives support to the belief; inference only plays a refining role.

In light of varied religious beliefs and experiences, both across and within particular religious traditions, we must conclude that evidence is needed to discriminate between different religious beliefs. This does not mean that religious experience cannot ground belief in God. It may be that some religious faculty grounds the belief, but that the faculty is in need of refinement, just as testimony can be a basic source of knowledge and yet still be in need of refinement. This continues to draw on the teachings of the Christian tradition, because although some Christians hold that we have access to God through our native faculties, those faculties have been marred by sin, so it should not be surprising that we can err in our knowledge of God, or that our native faculties alone are not sufficient.

This sensible evidentialist objection should not really be called an objection; perhaps the sensible evidentialist problem would be better. That is because Wykstra is not urging the reader to give up belief in God, but rather to properly acknowledge the role that evidence can and does play in knowing God. This problem seems to have played some role in motivating the later work of Alvin Plantinga where he is attempting to set out a positive account of how religious beliefs could amount to knowledge, rather than simply responding to an objection.

8. References and Further Reading

  • Alston, William. “Religious Experience and Religious Belief”. In Nous 16 (1982): 3-12.
    • An early essay by one of the central proponents of reformed epistemology.
  • Alston, William. Perceiving God. Ithaca, NY: Cornell University Press, 1991.
    • An important work on the epistemology of religious experience.
  • Baker, Deane-Peter. Tayloring Reformed Epistemology. London: SCM Press, 2007.
    • An attempt to bring together the work of Charles Taylor and certain aspects of reformed epistemology. Includes a helpful description and critique of arguments for reformed epistemology.
  • Beilby, James. Epistemology as Theology. Burlington, VT: Ashgate Publishing, 2005.
    • A detailed account of Alvin Plantinga’s reformed epistemology.
  • DeRose, Keith. “Voodoo Epistemology” unpublished manuscript.
    • A well-known essay – despite being unpublished – that criticizes Alvin Plantinga’s reformed epistemology.
  • Feldman, Richard. “Plantinga on Exclusivism”. In Faith and Philosophy 20 (2003): 85-90.
    • A paper arguing that it cannot be rational to hold religious beliefs when one is aware of the widespread disagreement about religion.
  • Grigg, Richard. “Theism and Proper Basicality: A Response to Plantinga”. In International Journal for Philosophy of Religion 14 (1983): 123-127.
    • An essay challenging the reformed epistemologist’s claim that there is a parity between perceptual belief and theistic beliefs.
  • Kenny, Anthony. Faith and Reason. New York: Columbia University Press, 1983.
    • Much of this book is on religious epistemology and it engages with reformed epistemology.
  • Mackie, J.L. The Miracle of Theism. New York: Oxford University Press, 1982.
    • An important book providing many arguments against theism.
  • Martin, Michael. Atheism: A Philosophical Justification. Philadelphia: Temple University Press, 1990.
    • This book presents numerous arguments in favor of atheism and against theism – including against reformed epistemology.
  • Plantinga, Alvin. God and Other Minds. Ithaca: Cornell University Press, 1967.
    • An early account of Plantinga’s parity argument which lays the foundation for reformed epistemology.
  • Plantinga, Alvin. Warrant and Proper Function. New York: Oxford University Press, 1993.
    • A discussion of proper function which also lays the foundation for Plantinga’s Warranted Christian Belief.
  • Plantinga, Alvin. Warranted Christian Belief. New York: Oxford University Press, 2000.
    • Arguably the most important work in reformed epistemology to date. Plantinga articulates and defends his version of the view at great length. It engages with many important debates in Philosophy of Religion.
  • Plantinga, Alvin. “A Defense of Religious Exclusivism” in Louis Pojman and Michael Rea (eds) Philosophy of Religion: An Anthology. Boston: Wadsworth, 2012.
    • Plantinga argues that it can be reasonable to believe that your religion is correct and that others are wrong.
  • Plantinga, Alvin and Nicholas Wolterstorff. Faith and Rationality. Notre Dame, Indiana: University of Notre Dame Press, 1983.
    • Contains many important early essays articulating and defending reformed epistemology.
  • Plantinga, A., Sudduth, M., Wykstra, S. and Zagzebski, L. “Warranted Christian Belief”. In Philosophical Books 43 (2002): 81-135.
    • A collection of essays critically engaging with Warranted Christian Belief, along with a reply from Alvin Plantinga.
  • Scott, Kyle. “Return of the Great Pumpkin”. In Religious Studies 50 (2014): 297-308.
    • A recent formulation of an objection to reformed epistemology along with a new response.
  • Sudduth, Michael. The Reformed Objection to Natural Theology. London: Ashgate, 2009.
    • Deals with the objections to natural theology that are typically posed by the reformed epistemologist.
  • Tomberlin, James and Peter van Inwagen (eds.). Alvin Plantinga. Dordrecht: D. Reidel, 1985.
    • A collection of essays examining the work of Alvin Plantinga, one of the central figures in reformed epistemology.
  • Wolterstorff, Nicholas. Reason within the Bounds of Religion. Grand Rapids, MI: Eerdmans, 1976.
    • An exploration of how his Christian faith ought to relate to his work as a scholar.
  • Wolterstorff, Nicholas. Lament for a Son. Grand Rapids, MI: Eerdmans, 1987.
    • Though not an academic book, some important points are made about reformed epistemology and religious epistemology in general.
  • Wolterstorff, Nicholas. Divine Discourse. Cambridge University Press, 1995.
    • A Philosophical exploration of claims that God speaks.
  • Wolterstorff, Nicholas. Justice: Rights and Wrongs. Princeton University Press, 2010.
    • Offers an account of rights and of justice. Engages significantly with Christian thought.
  • Wykstra, Stephen. “Toward a Sensible Evidentialism: On the Notion of ‘Needing Evidence’.” In Philosophy of Religion, New York: Harcourt Brace Jovanovich (1989): 426-437.
    • An analysis of Plantinga’s critique of evidentialism.
  • Zagzebski, Linda (ed.). Rational Faith: Catholic Responses to Reformed Epistemology, Notre Dame: University of Notre Dame Press, 1993.
    • A response to reformed epistemology from various Catholic philosophers.

 

Author Information

Anthony Bolos
Email: ABolos@vcu.edu
Virginia Commonwealth University
U. S. A.

and

Kyle Scott
Email: k.scott@heythrop.ac.uk
Heythrop College
United Kingdom

Feminist Ethics and Narrative Ethics

A narrative approach to ethics focuses on how stories that are told, written, or otherwise expressed by individuals and groups help to define and structure our moral universe. Specifically, narrative ethicists take the practices of storytelling, listening, and bearing empathetic, careful witness to these stories to be central to understanding and evaluating not just the unique circumstances of particular lives, but the wider moral contexts within which we all exist. In telling stories, they suggest, we both create and reveal who we think we are as moral agents and as persons; in granting these stories uptake—that is, in giving them epistemic credibility—we help to mold and sustain the moral identities of others, as well as our own.  Thus, theorists engaged in narratively-based moral scholarship take stories to be foundational for how we view the world and our place in it, arguing that they are the means through which we can make ourselves morally intelligible to ourselves and to others.  At their best, narrative methodologies offer non-ideal, epistemically rich approaches—that are not grounded in strict, juridical principles—to a number of philosophical discourses, including those central to questions of morality, identity, and social justice.  At their most worrisome, they appear to be merely loosely-related notions about the constitutive roles of stories in moral theory and practice that do not easily lend themselves to rigorous moral justifications, epistemic explanations, or the guiding of action, raising concerns about the theoretical and practical soundness of the whole endeavor.

Table of Contents

  1. Introduction
  2. Feminist Ethics
  3. Narrative Theories and Methodologies
  4. Feminist Ethics and Narrative
  5. Some Criticisms of Narrative Approaches to Ethics
  6. Conclusion
  7. References and Further Reading

1. Introduction

Even as a relatively new set of moral discourses and practices, narrative ethics has made its presence known. Among the areas within philosophy in which narrative has been particularly influential are biomedical ethics and feminist ethics. While this entry will only minimally touch on the former, the focus on the latter requires some qualification: while the themes, concerns, and ideas that connect feminist ethics and narrative theory are philosophically significant, this is not to suggest that all (or even most) of feminist ethics employs narrative methodologies, or that all (or most) feminist ethicists are narrativists. In fact, in addressing the oppression of women and other disadvantaged individuals and groups, a number are focused on alternative, non-narrative methodologies (for example, multicultural feminists tend to focus on interconnected systems of oppression, which may or may not be grounded in oppressive narratives), while others (for example, certain liberal feminists who tend to focus on justice-related remedies) reject the personal turn altogether. Thus, the connection between narrative ethics and feminist ethics as explored here ought not be viewed as global or as necessary, but as one that exists whenever the focus on the particular, on stories, and on phenomenologies within feminist ethics intersects with the conception of narratives as normatively constituting our moral universe. Viewed in this light, their relationship is philosophically important in the sense of sharing an anti-totalizing, anti-hierarchical view of the practices of morality, as well as in the sense of emphasizing the necessity of greater inclusivity in moral discourses. Indeed, to view feminist ethics through the lens of narrative, or to conceive of narrative ethics as an approach to feminist value theory, is not to exhaust the claims, significance, or methodologies of either one—it is simply to examine overlapping aspects of both, and how they have shaped, and continue to shape, each other. This entry, then, will focus on the complicated relationship between feminist ethics and narrative ethics. And while narrative ethics does not always neatly intersect with some of the concerns of feminist theorists, the relationship between the two is nevertheless a rather dynamic one, combining the social, political, epistemic, and other insights of feminist theory with the fluid methodologies of narrative. Moreover, although a number of feminist theorists have benefitted from, and contributed to, the various insights provided by narrative approaches to ethics, no single method or theory can definitively be called “narrative feminist ethics.”

Thus, this entry will not endeavor to reduce the relationship between feminist ethics and narrative ethics to a single approach, but instead will address the ongoing discourses between narrative approaches to ethics and feminist ethics, focusing on four specific issues: (1) What are some of the central concerns of feminist ethicists? (2) What are narrative methodologies, and how do they pertain to ethics, and specifically, to feminist ethics? (3) How have theorists engaged in feminist ethics turned to narrative, and which aspects of narrative seem to be most useful to their projects? (4) What are some general criticisms of narrative as an approach to ethics, broadly construed?

2. Feminist Ethics

Although the main purpose of this entry is not an exploration of the many nuances of the approaches to feminist ethics or the work of feminist ethicists, it is important to note how feminist ethics differs from the more “traditional” ethical theories, and importantly, how this difference makes feminist ethics responsive to narrative approaches and methodologies. To a large extent, feminist ethical theory can be understood as both a response to, and a movement against, a historical tradition of more abstract, universalist ethical theories such as utilitarianism, deontology, and in certain respects, contractarianism and virtue theory, which tend to view the moral agent either as an autonomous, rational actor, deliberating out of a calculus of utility or duty, or else as an often disembodied and decontextualized ideal decision-maker, unburdened by the non-ideal constraints of luck (moral and otherwise), circumstance, or capability (Nagel 1979; Brennan 1999; Nussbaum 2000). Specifically, feminist ethicists contend that this top-down, juridical, principlist theorizing has largely neglected the centrality of physical, social, and psychological situatedness, power differentials, and, importantly, the voices of women whose lived experiences have simply not been part of any ongoing moral debates (Young 2005; Jaggar 1992; Walker 1997; Lindemann Nelson 2001; Held 1990; Tessman 2005). As Alison Jaggar argues, traditional ethics emphasize male-centered issues of the public and the abstract while dismissing the private and the situated. As a result, women, and “women’s issues” that have to do with care, interdependent relationships, community, partiality, and the emotions, are de-centered and relegated to the margins of serious intellectual (and specifically philosophical) inquiry (Tong and Williams 2014; Jaggar 1992). While there is a significant number of subgroups of feminists—traditionally including care ethicists, Marxist feminists, liberal feminists, radical feminists, and ecofeminists, and lately divided into a greater variety of feminisms, including analytic feminism, continental feminism, radical lesbian separatist feminism, pragmatist feminism, psychoanalytic feminism, and all the intersections among them—the intent of feminist theory has been, and remains, the elimination of group and individual oppressions, and especially the silencing oppression of women, both in philosophical discourse and in the wider world (Tuana 2011). As Brennan argued, “feminist ethical theories [are] those ethical theories which share two central aims: (a) to achieve a theoretical understanding of women’s oppression with the purpose of providing a route to ending women’s oppression and (b) to develop an account of morality which is based on women’s moral experience(s)” (Brennan 1999, 860, citing Jaggar 1991).

While this entry does not address the many varieties or the latest developments within feminist ethics, it is important to note its general, and persistent, commitment to the rejection of, among other things, the sort of universalizable, uniform, acontextual “view from nowhere” that characterized much of ethical theory. As noted earlier, feminist ethics has generally pointed to the lack of serious philosophical attention to the aspects of human life where women (and other minorities) tend to predominate, a neglect that has excluded these actors and rendered them invisible to androcentric theory. In taking seriously, and including, the contexts, relationships, and commitments of women’s (and many marginal others’) experiences in its theorizing, feminist ethics does not deprive these individuals and groups of their moral agency. In this way, feminist ethics opens up the spaces of reasons within moral theory to marginalized others by, on the one hand, affirming the necessity of socially inclusive moral work, and on the other, by challenging the socially (and otherwise) excluding practices, boundaries, and limitations of its current discourses. Finally, although not strictly a part of this discussion, it is important to note that liberal feminism, and liberal feminists like Susan Moller Okin, represent an important exception to the more particularist, subjective approaches to women’s freedom noted earlier, focusing instead on the need for women’s personal and political autonomy, promoted by the liberal state, that enables their flourishing as persons, and fighting for the democratic self-determination denied to women by social and political patriarchies (Okin 1989).

What is mostly dismissed or neglected by traditional moral theorists is any engagement with non-ideal actors in non-ideal environments.  Often, the default “autonomous moral agent” within moral philosophy  is an otherwise unencumbered, abstract decision-maker, understood to be a man, coming to a decision that is not otherwise burdened by the messy contextuality of an actual lived life (Tong and Williams 2014; Jaggar 1992; Brennan 1999).  The result was not only a simplification of what it might mean to be a situated agent in non-ideal circumstances, but also the wholesale absence of agents who were not male, not unencumbered, and certainly not abstract.  In other words, those left out of the moral calculus—indeed, out of the philosophical moral imaginaries—were women, people of color, LGBTQ communities, economically underprivileged individuals, and many others.  Because many of these non-standard agents are engaged with the world in ways not considered by those relying on abstract agent models in their ethical analyses, and because they are instead participants in the interdependent moral practices that define them in terms of their relationships with others, they are viewed by a great majority of traditional moral theorists as somehow beyond the scope of philosophical discourse (Held 1990; Tong 2009).  Apart from neglecting the fact of men’s own situatedness and embeddedness in the particular circumstances of their lives, the exclusion of entire categories of individuals not only deprived these populations of a voice in philosophical debate, but also removed their experiences from the scope of possible normative discourse altogether. More specifically, the voices, and thus the moral experiences, of women and minorities were effectively silenced as reliable narrators of not only the moral significance of their experiences, but also of what moral theory and practice ought to take into account as its proper subject matter.

In addition to impoverishing and narrowing the idea of what moral theory is more generally, this kind of silencing has had uniquely burdensome costs for the silenced: as the feminist political theorist Iris Marion Young noted, those whose voices and whose presence have been historically missing from public discourses are severely challenged in receiving any kind of uptake of their views even once they attempt to engage (Young 1997; McAfee 2014). In the process of democratic deliberation, for example, those who are not habituated into participating in the overly formal, abstract, and juridical moral and sociopolitical discourses would be continuously marginalized and, in the end, dismissed. What Young argued ought to be an alternative way of engaging those on the periphery is a kind of “communicative democracy,” allowing for a number of different communicative styles (including narrative, rhetoric, and storytelling), perspectives, and voices (McAfee 2014).

Thus, understood very broadly, feminist ethics is a response to this epistemic, moral, and sociopolitical silencing born of exclusion, and to the oppression that it underwrites. As Samantha Brennan notes, “[f]eminist ethics seeks to overcome the limits of narrow, male-centered ethics by constructing moral theories which can make sense of the experiences of women as moral agents…feminist ethics has become associated with an ethics of lived, concrete experiences which takes most seriously women’s experiences of morality” (Brennan 1999, 861). Indeed, feminist ethicists, like Margaret Urban Walker, have argued against the impartial universality of “juridical,” or top-down, ethical methodologies that reduce moral reason to rigid, acontextual deductions, and favor more situated, “expressive-collaborative” approaches to morality that expand, rather than restrict, both the spaces of moral reasons as well as the variety of moral agents (Walker 1997).

What feminist philosophers accomplish, therefore, is the broadening and deepening of what it means to be engaged in moral philosophy by introducing the epistemically and morally rich stories of what it is like to be a non-ideal agent in a non-ideal world. It is this turn toward including, confronting, and challenging the oppressions of women (and other oppressed and often silenced populations) that serves as the beginning of the intersection between narrative and feminist ethics. And because, like much of feminist moral theory, narrative approaches to ethics emphasize the contextuality, situatedness, and shared nature of public and private life as central to moral reasoning, some leading feminist philosophers have offered a number of varying approaches to philosophical ethics that can all nevertheless be called “narrative” in significant ways.

3. Narrative Theories and Methodologies

Generally speaking, there is not a single theory of narrative ethics, nor is there a single correct way to engage in narrative analysis.  However, there are a number of views and practices that have a family resemblance, and can be construed as a part of a larger, more amorphous field of narrative ethics.  One such view about how narrative is to be understood as a part of moral theory is offered by Kathryn Montgomery Hunter.  Hunter argues that “[i]n using the word ‘narrative’ somewhat interchangeably with ‘story’ I mean to designate a more or less coherent written, spoken, or (by extension) enacted account of occurrences, whether historical or fictional” (Hunter 1996, 306).

There are many ways to define, and engage in, a narrative approach to ethics. By a “narrative approach,” I mean a focus on the significance of context, situatedness, and, importantly, the communication of the stories people tell about themselves and others in trying to make themselves, others, and, more broadly, their world, mutually intelligible. Narrative ethicists often criticize what they consider to be a preoccupation, among more traditional juridical moral theories, with impartiality, distance, and universalizability at the expense of personal relationships (Walker 1997; Lindemann Nelson 2001; Rimmon-Kenan 2002). The problem, they suggest, is not merely the exclusion of so many from juridical moral discourses, but also the absence of any warrant for why moral actors would desire, and be motivated by, something like a good will (or a utilitarian-based outcome, or a rights-based justification) as part of a meaningful life. Indeed, they argue, given the requirements of juridical moral thought, we are left wondering what there is to admire about such a life, why such a life is worth having, and why disinterested detachment from everyone and everything one cares about – that is, detachment from all that makes the moral life not just worthwhile but possible – is the sole path to robust moral agency. Although duties and laws might very well be a part of moral work, the “ought” of morality cannot be grounded primarily in bare, unyielding principles.

In response, narrative approaches to moral theory and practice have been put forth by a number of philosophers (especially those engaged in normative ethics and applied ethics, such as medical ethics), literary scholars, and psychologists, including Alasdair MacIntyre, Charles Taylor, Paul Ricoeur, Paul John Eakin, Hilde Lindemann Nelson, Margaret Urban Walker, Martha Nussbaum, Kathryn Montgomery Hunter, and Jerome Bruner, among others. The philosopher Marya Schechtman, for example, has argued that narratives are essential not only to understanding what we do, but to who we are, suggesting that only those who “weave stories of their lives” are, strictly speaking, persons. This is so, she suggests, because one’s narrative is precisely what constitutes—or, as she argues, characterizes—one’s personal identity (Schechtman 1996). Generally, narrative theorists take the personal story, or the first-person narrative, not only to be descriptively informative, but also normatively vital to connecting a particular life with the rest of a moral community (or communities), making the story, and the storyteller, both intelligible and open to normative analysis. In other words, theorists who use a narrative approach to ethics take the process of telling and hearing the stories of our lives to be doing something morally significant. For example, feminist philosopher Hilde Lindemann offers the following summary of some possible roles of stories in moral reasoning:

Narrativists have claimed, among other things, that stories of one kind or another are required: (1) to teach us our duties, (2) to guide morally good action, (3) to motivate morally good action, (4) to justify action on moral grounds, (5) to cultivate our moral sensibilities, (6) to enhance our moral perception, (7) to make actions of persons morally intelligible, and (8) to reinvent ourselves as better persons (Nelson 2001, 36).

Thus, narratives can differ teleologically.  They can also be judged to be good and bad, desirable and undesirable, truthful and false.  Indeed, instead of providing the sort of insight into ourselves that might be constructive and action-guiding, they can encourage dishonesty, cowardice, or can serve to indulge our fantasies in generally unhealthy, or even destructive, ways. Narratives can be “master narratives” that tell us where and how we are socially situated with respect to our duties, claims, and expectations.  One can also resist harmful master narratives through a counterstory, whose purpose it is to “root out the master narratives in the tissue of stories that constitute an oppressive identity and replace them with stories that depict the person as morally worthy” (Lindemann 2001, 150). Moreover, one can resist a master narrative through a humorous re-casting of that narrative – a king with no clothes (power), Victor/Victoria (sexuality), and so on – that serves to expose the “master narrative” as unreliable, or at least of doubtful validity. And, of course, the master narratives themselves can differ:  while they can oppress, they can also inform, (re)align, and guide. Counterstories, too, can be destructive as well as reparative. What matters is acquiring the ability (and desire) to listen or read closely enough, with sufficient attention and discernment, to tell the difference (Lindemann 2001).  Morality, in short, is not solely within the purview of a judge who possesses the necessary moral epistemology and pronounces on a given act as “warranted” or “unwarranted,” but is something that we do together: it is a socially embodied medium of understanding and adjustment in which people engage in practices of allotting, assuming, or deflecting responsibilities of various kinds (Walker 1997).  These practices create a vocabulary and resources for moral deliberation that give us recognized and socially shared ways of deciding what is good or right to do.

Since narrative approaches to ethics are not a singular, monolithic whole, the understandings and practices of what it might mean to engage in moral analysis narratively do indeed vary.  Narratives can be read, heard, or viewed through the media of film and literature, or through the oral traditions of storytelling, thus expanding one’s emotional, social, and intellectual vocabulary and perception. In this way, we become not merely better informed about being otherwise, but better equipped to address morally complex and difficult situations in the real world (Lindemann 2001; Nussbaum 1990).  These narrative techniques can be reified by substituting a “master” model of moral reasoning (say, the Enlightenment model of detached objectivity and rationality) with the kind of normativity that is action-guiding for a particular narrative community that wishes to find justification for, and thus make moral sense of, its way of life (Lindemann 2001; MacIntyre 1984). They can also serve, when compared to each other, as methods of clarifying confusing or contradictory moral reasoning. In trying to work through some particularly difficult moral dilemmas, narratives can help us to see where seemingly divergent viewpoints can possibly move closer together, when they cannot, and why, without resorting to ill-fated attempts to (re)order principles and (re)interpret laws.  In short, a narrative approach to doing ethics takes its cues from the stories themselves, as they are told, heard, and (mis)understood, and although there are a number of approaches and methodologies, they tend to center on questions of who the teller is, what the teller might mean, who the intended (and unintended) audience might be, what the effect of the story is, and (perhaps less frequently) what constitutes a good story – and what might be meant, in this case, by “good.”  Writing from a narrativist medical ethics perspective, Joan McCarthy suggests that some of the central tenets of narrative approaches to moral issues can be understood as the following:

  1. Every moral situation is unique and unrepeatable and its meaning cannot be fully captured by appealing to law-like universal principles.
  2. …[A]ny decision or course of action is justified in terms of its fit with the individual life story or stories…
  3. The objective of the task of justification in 2 is not necessarily to unify moral beliefs and commitments, but is to open up dialogue, challenge received views and norms, and explore tensions between individual and shared meanings. (McCarthy 2003)

Thus, a narrativist account of moral problems, dilemmas, and general questions of moral judgment takes seriously the multitudes of individual lives, and thus the multitudes of voices and interpretations of moral situations.  What matters, then, is not so much a reduction of moral positions to a commonly held single perspective, but an opening up of a space for reasons and dialogues with equally morally worthy others, thereby expanding the possibility of a shared, rather than a unitary and monolithic, moral universe.

One way to charitably interpret this narrative turn in ethics is to take seriously the proposition that stories simply provide the sort of flexibility of understanding and variability of perspective that deep and “thick” moral work requires.  Such an approach makes possible a way to engage in moral negotiations by reminding the participants to take into account how they got to the present point, what the present circumstances are, as well as what they ought to do in the future.  At their best, narrative approaches to ethics welcome voices that, as Young noted, are differently situated, possessing quite radically divergent views of where they fit within the moral and sociopolitical discourses and debates.  They remind us that different participants carry the burdens of different histories, epistemologies, and moralities. In the end, narrative collaborative methodologies see stories not merely as ways to decide among competing principles, but as self-contained, and context-rich, reasons to revise moral understandings, to negotiate solutions, and to continue seeking the ever-elusive common ground (Walker 1997).

4. Feminist Ethics and Narrative

Given the narrative emphasis on multivocality, shared discourse, and the moral significance of individual voices, it is perhaps not entirely surprising that feminist philosophers have both employed and expanded the idea of narrative within feminist philosophy.  For example, Margaret Urban Walker, in “Moral Particularity,” has argued that one of the characteristics that make an agent a distinctly moral one is her desire to define herself as the protagonist of a coherent narrative (Walker 2003).  The “moral persona” that emerges out of such narrative coherence, she claims elsewhere, is defined by her commitments to individuals, institutions, and values.  It is this desire to self-define as a protagonist of a largely coherent narrative that makes one a moral agent (Walker 1989, 177).  Walker later expanded her views by incorporating the narrative notions of collaboration and negotiation into moral work.  Perhaps as a way of challenging what she calls the “theoretical-juridical model” of ethics exemplified by the more traditional top-down theories, her “expressive-collaborative” approach to morality turns on its head both the priorities and the presuppositions of what it means to be engaged in moral practice.  As a matter of priority, the expressive-collaborative approach tends to view the importance of moral work not as necessarily the juridical determination of “right” and “wrong” based on a set of deduced, unyielding norms or laws, but more as a way to negotiate and narrate our way through the complex and imperfect social, physical, and psychological realities of being human: the “expressive-collaborative model” encourages us to view “an investigation of morality as a socially embodied medium of mutual understanding and negotiation between people over their responsibility for things open to human care and response” (Walker 1997, 173).  The distinction between this approach and the non-narrative juridical one is that while the latter emphasizes the uniformity of what is required, forbidden, or permitted in a given situation for all similarly placed agents, the expressive-collaborative model prioritizes moral competence as strong moral self-definition, or, as Walker has argued, “the ability of morally developed persons to install and observe precedents for themselves which are both distinctive of them and binding upon them morally” (Walker 2003).  In other words, her argument explicitly makes the point that the work of morality has to do with accountability and responsibility – and thus moral reliability, requiring a certain integrity in one’s relationships, sense of identity, and values. To be accountable is, to some extent, to be viewed as accountable by others, and this means that our actions have to tell a coherent story at least to the extent that they are reasonably predictable by those who are affected by them in the sorts of situations that matter morally.  In the end, moral accountability is a narrative practice of making ourselves both internally and externally coherent and, in so doing, of weaving ourselves into the fabric of a moral universe (Walker 1989).

Other feminist scholars have also turned to narrative as a way to engage with some of the central concerns within feminist ethical theory.  In considering how personal identities structure our various moral discourses and concepts, Hilde Lindemann claims that these identities are “complex narrative constructions consisting of a fluid interaction of the many stories and fragments of stories surrounding the things that seem most important, from one’s own point of view and the point of view of others, about a person over time” (Nelson 2001, 20).  She argues that not only are identities, and thus one’s moral standing in society, narratively constituted, but that they can also be narratively damaged by oppression and oppressive practices.  Indeed, the moral damage of oppressive “master narratives”—destructive especially to those who are already socially subordinated and disempowered—must be counteracted with powerful counter-narratives that just might repair these broken identities, securing individual (and sometimes group) moral agency.  The two kinds of moral damage that can impact the cohesion of one’s identity—one depriving one of important social goods, and the other of self-respect—can be repaired by, and through, counterstories, which narratively resist, challenge, and overcome the damaging master narratives inflicted by the powerful on the vulnerable (Lindemann 2011).  For example, women whose moral agency is compromised because they choose childlessness against a more general pronatalist narrative can offer stories of womanhood as personhood without essentializing the woman-as-mother.  As noted earlier, these counternarratives can take many forms, but their purpose remains consistent:  to both expand normative spaces to include those previously excluded, and to admit, and in fact encourage, the use of narrative as a legitimate practice of engagement within the broader moral and sociopolitical discourses.

Of course, Walker and Lindemann are not the only feminist theorists to turn to narrative.  Because feminist theorists are generally concerned with addressing various kinds of oppressions—and especially the oppression of women—they have often construed personal stories as fruitful ways of theorizing morally dilemmatic situations.  These accounts serve a number of goals, including clarifying the harms of oppression, explaining the personal costs of cruel, myopic, or marginalizing moral reasoning, re-orienting the purely juridical and theoretical toward the non-ideal, and, among other things, motivating the development of moral thinking in ways that are inclusive of contextualities, situatedness, and burdened lives.  Vivid, empathy-producing examples that tend to engage the moral imagination are often used by feminist scholars to focus on specific problems in order to show—and not simply to argue—that issues of sexism, oppression, gross power differentials, exclusion, and domination must be recognized and addressed both theoretically and practically.  For example, Sandra Bartky, as a part of her analysis of objectification, offers a story of harassment, catcalls, and whistles, noting that her previously unremarkable walk was now a source of identity-threatening humiliation and brutal, othering objectification (Bartky 1990; 1979).  Moreover, Susan Brison, as a part of her examination of violence, identity, and the moral work of bearing witness, shares the very personal trauma of her brutal rape and assault.  As Anita Superson notes, “Brison argues that the experience of rape should be of interest to philosophers because it raises many philosophical issues, including the metaphysical issue of the disintegration of the self, the epistemological issue of the victim’s skepticism about everyone and everything, as well as the obvious legal, moral, and political issues relating to what it is like to be a victim of rape, why rape occurs and is so prevalent in our society, what its meaning is, and so on” (Superson 2009).  Similarly, Susan Estrich employed her own story not just of rape, but of a fight for her credibility as a reliable narrator of her experiences as a crime victim to police who did not take her claim to be a serious one (Superson 2009).  Through this reliance on a personal narrative of trauma and victimization, she addresses not only the broader challenge of confronting the presuppositions and prejudice inherent in American rape law, but also makes a case for alternative, less abstract and rigid constructions of the notions of force and consent.

Turning to a different aspect of narrative—namely, fiction—the philosopher Martha Nussbaum argues that the narrativity of literature provides a deep and necessary source of moral knowledge that more sharply attunes people not only to the various sources of morality, but also to themselves as sensing moral beings who enter into relationships of mutual responsibilities and obligations with each other (Nussbaum 1990).  Finally, Seyla Benhabib has noted that narratives are not only the central constituting elements of a self, but that “[w]e are born into webs of interlocution or into webs of narrative—from the familial and gender narratives to the linguistic one to the macronarrative of one’s collective identity. We become who we are by learning to be a conversation partner in these narratives. Although we do not choose the webs in whose nets we are initially caught or select those with whom we wish to converse, our agency consists in our capacity to weave out of those narratives and fragments of narratives a life story that makes sense for us, as unique individual selves” (Benhabib 1999, 344).

Thus, the process of challenging, re-defining, and finally re-making moral theory and practice that is so central to the project of feminist ethics can, with the help of narrative methodologies, go far toward addressing women’s oppressions, as well as the oppressions of numerous excluded others.  By telling their stories—by grounding moral theorizing in personal narratives rather than in purely idealized contexts and agents—feminist scholars are not only able to motivate a deeper understanding of ethical dilemmas, but also to advocate for practical changes in the structures of marginalizing social practices by creating a more inclusive space of reasons within which to negotiate our moral understandings.

5. Some Criticisms of Narrative Approaches to Ethics

Even though embraced by a not insignificant number of feminist ethicists, narrative approaches to ethics, whether feminist or otherwise, are not without their critics.  While these criticisms are diverse and multifaceted, many of them converge on the worries about narrative’s lack of moral grounding, epistemic justification, and normative guidance.  Some concerns stem from a reliance on context, perspective, and circumstances of specific stories, which, for some, drift too close to relativism.  Others worry about the dependence on testimony and storytelling as a basis of moral theory.  A number of theorists also wonder about the theoretical and practical value of a narrative, contextualized approach for moral theory, broadly construed.  Finally, some claim that a narrative approach to understanding one’s place in the moral universe is not only misguided, but unnecessary and not reflective of what matters to us morally as human beings.

First, there is the worry about relativism: the turn of feminist ethics toward narrative and experiential pluralism might re-make moral theory merely into an account of “historically specific moral practices and traditions” (Jaggar 1991, 93).  Alison Jaggar further notes that while feminist ethics is “incompatible with any form of moral relativism that condones the subordination of women or the devaluation of their moral experience…[i]t is neutral, however, between the plural and local understanding of ethics, on the one hand, and the ideal of a universal morality, on the other” (Brennan 1999, 862, citing Jaggar 1991, 94).  Thus it would seem that worry about the slide of feminist narrative-based theory into moral relativism is at least prima facie warranted:  Ought feminist theorists relying on narratives to focus on the local and the contextualized, rather than on the abstract and universalizing, if so doing offers an expansion of new “political agendas” (Shrage 1994) while at the same time leading practitioners to accept practices and narratives that might contribute to other kinds of oppressions (Brennan 1999)?  Perhaps, if identity-and-moral-community-defining stories are to have any kind of moral grounding that is both useful and reasonably defensible, Susan Sherwin’s suggestion that revisable and process-dependent “community standards” might offer something beyond a fully relativistic and situational ethics is one way out of the worries about relativism (Brennan 1999; Sherwin 1992).

Second, Diana Tietjens Meyers, concerned about the reliability of testimony, suggests that narrative theory, instead of simply looking to storytelling as its source of normativity, must prove its credibility as an account of morality by insisting on a particular skill set of the storyteller.  She argues: “To ensure respect for the diversity of morally decent lives, narrativity theory must explicate the credibility of self-narratives in terms of this repertoire of skills.  Self-narratives are not all equally valid, revealing, and conducive to flourishing, but there is no property internal to self-narratives nor any interpersonal test that can rank them. The best gauge of a self-narrative’s credibility, then, is the narrator’s overall degree of mastery of the self-discovery, self-definition, and self-direction skill repertoire and the extent to which the narrator made use of this competency in constructing a particular self-narrative” (Meyers 2004, 303).  Meyers claims that if narratives are simply taken at face value, we might be left with “all sorts of fictions—fairy tales, negative utopias, science fiction, romances, and horror stories—as well as autobiographical narratives” (Meyers 2004, 303).  The mere fact that a story is good or interesting, Meyers notes, does not guarantee that it will be anything but an exercise in wishful fiction or a flight of fancy.  In order to properly address this possibility, one must acquire particular skills—introspection, volition, nurturing, communication, listening, and memory, among others—that allow one to recall relevant experiences, to imagine feasible options, and so on (Meyers 2004).  Indeed, Meyers insists that “[t]o curb overactive imaginations, to overcome isolating silence, and to secure the credibility of self-narratives, the competency that keeps people attuned to themselves and alive to life’s possibilities must underwrite the processes of self-narrating” (Meyers 2004, 303).  Without this kind of rigorous self-discipline, a narrative approach to morality seems at best less than fully credible, and at worst a methodologically compromised enterprise that confuses the interesting and the exciting with the epistemically important and the morally compelling.

Third, another kind of critique of narrative is offered by the philosopher and bioethicist Tom Tomlinson, who focuses on the worry about whether a narrative approach to ethics brings something distinct to moral theory that other, more traditional, approaches do not (Tomlinson 1997).  Tomlinson argues that even though narrative might be methodologically important to the development of ethical reasoning, it does not offer “a mode of ethical justification that is independent from or superior to appeals to moral principles” (Tomlinson 1997, 132).  On his view, narrative does not serve the sort of “central epistemic function in the discovery, justification, or application of ethical knowledge” that its supporters take it to be serving (Tomlinson 1997, 124).  Instead, he argues that a focus on stories does not go far enough – or, indeed, any distance at all – toward enriching our moral epistemology. If narrative sets itself against the overstructured and sterile methodology of juridical thinking, then, Tomlinson claims, we ought to expect to find something morally valuable that is unavailable to us through principles alone. However, this does not seem to be the case: First, if one takes the kind of narrative approach that Martha Nussbaum has proposed—whereby engaging with certain kinds of literature allows for the development of a more nuanced, and empathetic, view of moral discourse (Nussbaum 1990)—and reads a novel in order to broaden one’s moral imagination, one is missing the actual encounter with a living person, and is thus epistemically and morally limited by the four corners of the text. Whatever moral “truth” is made available by the story seems limited situationally to the characters within that story, and does very little to speak to those who do not also share the world in which a particular moral lesson unfolds.  And even if one were to set aside literary narrative and enter into a conversation with other people, the sort of particular knowledge one might derive through these interactions would not yield any moral knowledge that is generalizable—that translates from one story, or from one storyteller, to another. At best, Tomlinson suggests, “novels and stories become…vivid illustrations of knowledge verified through other means” (Tomlinson 1997, 125).

On Tomlinson’s account, then, narrative does not appear to have much to contribute toward assisting the ethical discourse about aligning, or at least making less attenuated, the relationship between moral principles and lived experiences.  Indeed, Tomlinson sees no clear way to distinguish what a uniquely narrative approach contributes to addressing ethical dilemmas from what other methodologies already offer.  For example, in a case where one is torn between disclosing or withholding potentially devastating news, a narrative theorist might require a consideration of how much truth to tell, how to tell it, how one will hear what is said, who is doing the speaking, and so on. However, Tomlinson suggests that, aside from the vagueness of the narrative criteria themselves, what it might mean to “interpret” this information is unclear:  “Any social system of reasoned reflection involves a ‘communal dialogue’ of ‘give and take,’ including those deliberatively rooted in principles…The failure to provide any more precise account of the nature and role of ‘interpretation’ is a symptom of the tendency to wave it and ‘narrative’ as banners that fly over everything bright and beautiful being ignored by those crude and insensitive principles” (Tomlinson 1997, 127).

Moreover, Tomlinson rejects what he views as the tendency among proponents of narrative ethics to conflate the descriptive claim that one’s life is best understood as a narrative with the normative claim that one’s choices – and especially one’s moral choices – ought to be judged according to their coherence with a given life narrative. First, we do not, he claims, live a life that can be forced into coherence by a storyline – or by anything else: “we don’t live out a narrative, we create one by living a life” (Tomlinson 1997, 130).  Second, even if we were to take seriously the narrative we create by “living a life,” “the [moral] question of how best to live out ‘that’ unity is not answered by the notion of narrative unity. It’s answered by appeal to extranarrative ideals that elevate some kinds of narrative over others” (Tomlinson 1997, 130).  And since these ideals can be whatever one desires them to be, the resulting coherence loses any meaningful normative force. Unless one subscribes to one “extranarrative” ideal – or, indeed, to one principle – over another, the standard of narrative coherence seems neither to add anything to principlist analysis nor to offer an epistemically independent criterion of ethical reasoning, explanation, or justification.

While Tomlinson’s arguments focus on the claim that narratives do not offer any ethically or epistemically satisfying criteria that we could use in making moral choices, another kind of criticism, offered by John Arras, centers on the moral incompleteness of narrative as moral theory.  Although he takes a somewhat more conciliatory, though still critical, view, his dissatisfaction with narrative as a method for doing ethics is rooted in his suspicion of narrative as a means of grounding moral justification—of finding the relationship “between the telling of a story and the establishment of a warrant for believing in the moral adequacy or excellence of a particular action, policy or character.”  Having examined what he takes to be three different approaches to narrative—“as an essential element of any and all ethical analyses,” as an ahistorical rejection of the Enlightenment project, and as a postmodern attempt to substitute narrative “for the entire enterprise of moral justification”—he concludes that, while narrative seems to be an important part of ethical analysis, its ability to completely replace principles and ethical theory seems doubtful at best if what one seeks is moral justification for actions (Arras 1997, 79-85).  Arras’s view, therefore, is that narrative seems to be merely supplementary to principles, and, in the end, is no threat to their moral primacy.

Finally, Galen Strawson, in “Against Narrativity,” argues that a narrative approach (to morality, to identity, and so on) is not only presumptively false from a folk-psychological, or common-sense, perspective, but is also descriptively vague and normatively unmoored.  He claims that not only does he not see himself or his life in narrative terms, but that he resents the idea of such a practice altogether.  In response to the urging of (feminist and other) narrative theorists to engage in moral work through narrative, Strawson wonders, “Why on earth, in the midst of the beauty of being, it should be thought to be important to do this” (Strawson 2004, 436).  Indeed, he notes, “there are deeply non-Narrative people and there are good ways to live that are deeply non-Narrative” (Eakin 2006, 180-187).  Moral claims about oneself, about others, or about the world more broadly, Strawson insists, do not require the reliance on stories, or on how these stories relate to one’s present and future agency and shared moral understandings.

6. Conclusion

It can be said with some certainty that narrative approaches to ethics are not without considerable controversies and passionate critiques.  It also seems clear that there are significant and challenging insights offered by narrative ethicists—a number of which have been theorized, defended, and expanded upon by feminist ethicists.  Indeed, feminist ethics and narrative approaches to normativity share a number of concerns, goals, and motivations that offer powerful counterstories to the largely principlist, abstract, and universalizing practices of traditional moral theory.  But shared worries and a desire for a more multivocal and collaborative moral discourse do not presuppose, nor require, the same methodologies, and there are some clear and powerful points of disagreement both within feminist philosophy about the role of narrative in ethical theory and among narrativists themselves about what kinds of narratives ought to count as properly normative and adequately action-guiding.  Because there is not a single approach to feminist ethics, and certainly no single way of engaging in narrative analysis, it is quite difficult to make any tidy generalizations, either about the theories themselves, or about their complicated relationship.  Yet perhaps this is exactly the point: theorizing that tends to move away from such generalization in its own methodologies unsurprisingly escapes any attempts at totalizing definitions, in the process changing and restructuring the spaces and scope of moral discourse.

7. References and Further Reading

  • Arras, J.  “Nice Story, But So What?” In H. L. Nelson (ed.).  Stories and Their Limits: Narrative Approaches to Bioethics.  New York: Routledge, 1997.
  • Bal, M.  Narratology: Introduction to the Theory of Narrative. Toronto: University of Toronto Press, 1997.
  • Bartky, S.  “On Psychological Oppression.” In S. L. Bartky (ed.).  Femininity and Domination: Studies in the Phenomenology of Oppression.  New York: Routledge, 1990.  Reprinted from Philosophy and Women (Wadsworth Publishing, 1979).
  • Becker, L. C. “Impartiality and Ethical Theory.” Ethics 101.4 (1991): 698-700.
  • Benhabib, S.  “Sexual Difference and Collective Identities: The New Global Constellation.” Signs: Journal of Women in Culture and Society 24.2 (1999): 335-361.
  • Benson, P.  “Free Agency and Self-Worth.” Journal of Philosophy 91.12 (1994): 650-668.
  • Benson, P. “Feminist Second Thoughts about Free Agency.” Hypatia 5 (1990): 47-64.
  • Benson, P. “Feminist Intuitions and the Normative Substance of Autonomy.” Personal Autonomy: New Essays on Personal Autonomy and its Role in Contemporary Philosophy, Ed. James Stacey Taylor. Cambridge:        Cambridge University Press, 2004.
  • Brennan, S.  “Recent Work in Feminist Ethics.” Ethics 109.4 (1999): 858-893.
  • Brison, S. J.  “Surviving Sexual Violence: A Philosophical Perspective.” In S. G. French, W. Teays, and L. M. Purdy (eds.). Violence Against Women: Philosophical Perspectives.  Ithaca, New York: Cornell University Press, 1998.
  • Charon, R.  “Narrative Medicine: Attention, Representation, Affiliation.” Narrative 13.3 (2005): 261-270.
  • Charon, R.  and Montello, M.  Stories Matter: The Role of Narrative in Medical Ethics. New York: Brunner- Routledge, 2002.
  • Christman, J.  “Narrative Unity as a Condition of Personhood.” Metaphilosophy 35.5 (2004): 695–713.
  • Crossley, M. L. “Narrative Psychology, Trauma and the Study of Self/Identity.” Theory and Psychology 10.4   (2000): 527–546.
  • Damasio, A. R. The Feeling of What Happens: Body and Emotion in the Making of Consciousness. New York:       Harcourt Brace, 1999.
  • Dennison, A.  Uncertain Journey: A Woman’s Experience Of Living With Cancer. Newmill: Patten Press, 1996.
  • DesAutels, P. and Walker, M. U., eds. Moral Psychology: Feminist Ethics and Social Theory. Lanham MD:   Rowman and Littlefield Publishers, Inc., 2004.
  • Eakin, P. J.  “Narrative Identity and Narrative Imperialism: A Response to Galen Strawson and James Phelan.” Narrative 14.2 (2006): 180-187.
  • Eakin, P. J. How Our Lives Become Stories: Making Selves. Ithaca: Cornell University Press, 1999.
  • Frank, A. W. “Just Listening: Narrative and Deep Illness.” Families, Systems and Health 16.3 (1998): 197–212.
  • Frank, A. W. The Wounded Storyteller: Body, Illness, and Ethics. Chicago: University of Chicago Press, 1997.
  • Frank, A. W. “Asking the Right Question about Pain: Narrative and Phronesis.” Literature and Medicine 23.2 (2004): 209-225.
  • Held, V. “Feminist Transformations of Moral Theory.” Philosophy and Phenomenological Research, Fall Supplement, 1990.
  • Held, V. Feminist Morality: Transforming Culture, Society, and Politics. Chicago: University of Chicago Press, 1998.
  • Held, V. “Feminist Reconceptualizations in Ethics.” In J. Kourany (ed.). Philosophy in a Feminist Voice: Critiques and Reconstructions.  Princeton: Princeton University Press, 1999.
  • Hooker, B.  and Little, M. O.  Moral Particularism. Oxford: Oxford University Press, 2000.
  • Hunter, K.  M.  Doctors’ Stories: The Narrative Structure of Medical Knowledge.  Princeton, New Jersey:  Princeton University Press, 1991.
  • Jaggar, A. “Feminist Ethics: Projects, Problems, Prospects.” In C. Card (ed.). Feminist Ethics.  Lawrence: University of Kansas Press, 1991.
  • Jaggar, A. “Feminist ethics”. In L. Becker and C. Becker (eds.), Encyclopedia of Ethics, New York:   Garland Press, (1992):  363-364.
  • Kleinman, A.  The illness narratives: Suffering, healing and the human condition. New York: Basic Books, 1988.
  • Korsgaard, C. M. The Sources of Normativity. Cambridge: Cambridge University Press, 1996.
  • Lindemann, H., Verkerk, M., and Walker, M. U. Naturalized Bioethics: Toward Responsible Knowing and Practice. Cambridge, MA: Cambridge University Press, 2009.
  • Lindemann, H.  Holding and Letting Go: The Social Practice of Personal Identity. New York: Oxford University Press, 2014.
  • Little, M. O.  “On Knowing the `Why’: Particularism and Moral Theory.” The Hastings Center Report 31.4 (2001): 32-40.
  • Lorde, A.  The Cancer Journals. San Francisco: Spinsters/Aunt Lute, 1980.
  • Lugones, M. “Playfulness, ‘World’-Traveling, and Loving Perception.” Hypatia 2 (1987): 3-19.
  • Lugones, M. and Spelman, E. “Have We Got a Theory for You! Feminist Theory, Cultural Imperialism, and the Demand for ‘the Woman’s Voice.’” Women’s Studies International Forum 6.6 (1983): 573-581.
  • MacIntyre, A.  After Virtue: A Study in Moral Theory. Indiana: University of Notre Dame Press, 1984.
  • MacKenzie, C. and Stoljar, N.  Relational Autonomy: Feminist Perspectives on Autonomy, Agency, and the Social Self. New York: Oxford University Press, 2000.
  • Martin, C.  “Feminism, the Self, and Narrative Ethics.”  Macalester Journal of Philosophy 16:1 (2007):7-14.
  • Mattingly, C.  Healing Dramas and Clinical Plots : The Narrative Structure of Experience. Cambridge: Cambridge University Press, 1998.
  • McAdams, D. P. The Stories We Live By: Personal Myths and the Making of the Self. New York: The Guilford Press, 1997.
  • McAfee, N.  “Feminist Political Philosophy”, The Stanford Encyclopedia of Philosophy (Summer 2014 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/archives/sum2014/entries/feminism-political/>.
  • McCarthy, J. “Principlism or narrative ethics: must we choose between them?” Medical Humanities 29.2 (2003): 65-67.
  • Merleau-Ponty, M.  The Phenomenology of Perception. London and New Jersey: Routledge, 1992.
  • Meyers, D.  T.  “Narrative and Moral Life.” In Cheshire Calhoun (ed.).  Setting the Moral Compass: Essays by Women Philosophers.  Oxford University Press, 2004.
  • Meyers, D. T. and Jaggar, A. (eds.).  Feminists Rethink the Self. Boulder: Westview Press, 1997.
  • Mullan, F., Ficklen, E., and Rubin, K. (eds.).  Narrative Matters: The Power of the Personal Essay in Health Policy. Baltimore: The Johns Hopkins University Press, 2006.
  • Nagel, T.  The View From Nowhere.  New York:  Oxford University Press, 1989.
  • Nelson, H. L. (ed.). Stories and Their Limits: Narrative Approaches to Bioethics.  New York: Routledge, 1997.
  • Nelson, H. L. Damaged Identities, Narrative Repair. Ithaca: Cornell University Press, 2001.
  • Nelson, H. L. “Context: Backward, Sideways, and Forward.” HEC Forum: Special issue on narrative 11.1 (1999): 16-26.
  • Nelson, H. L. “7 Things to Do with Stories.” unpublished manuscript.
  • Nussbaum, M.  C. Love’s Knowledge: Essays on Philosophy and Literature. New York: Oxford University Press, 1990.
  • Nussbaum, M. and Sen, A. The Quality of life. Oxford:  Clarendon Press, 1993.
  • Nussbaum, M. and Glover, J. Women, culture, and development: a study of human capabilities. Oxford:  Clarendon Press, 1995.
  • Okin, S. M.  Justice, Gender and the Family. Basic Books: New York, 1989.
  • Ricœur, P.  Time and Narrative (Temps et Récit), 3 vols. trans. Kathleen McLaughlin and David Pellauer. Chicago: University of Chicago Press, 1984.
  • Rimmon-Kenan, S.  “The story of ‘I’: Illness and narrative identity.” Narrative 10.1 (2002): 9-19.
  • Rorty, R.  Contingency, Irony, and Solidarity. Cambridge: Cambridge University Press, 1989.
  • Schechtman, M.  The Constitution of Selves. Ithaca: Cornell University Press, 1996.
  • Shrage L.  Moral Dilemmas of Feminism: Prostitution, Adultery, and Abortion.  New York:  Routledge, 1994.
  • Sherwin, S.  No Longer Patient: Feminist Ethics and Health Care.  Philadelphia:  Temple University Press, 1992.
  • Strawson, G.  “Against Narrativity.” Ratio 17.4 (2004): 428-452.
  • Superson, A.  “Feminist Moral Psychology”, The Stanford Encyclopedia of Philosophy (Winter 2009 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/entries/feminism-moralpsych/>.
  • Taylor, C.  Sources of the Self: The Making of the Modern Identity.  Cambridge: Harvard University Press, 1992.
  • Tessman, L.  Burdened Virtues: Virtue Ethics for Liberatory Struggles.  New York: Oxford University Press, 2005.
  • Tomlinson, T.  “Perplexed about Narrative Ethics.” In H. L. Nelson (ed.). Stories and Their Limits: Narrative Approaches to Bioethics.  New York: Routledge, 1997.
  • Tong, R.  Feminist Thought: A More Comprehensive Introduction, 3rd edition, Boulder, CO:  Westview Press, 2009.
  • Tong, R. and Williams, N. “Feminist Ethics”, The Stanford Encyclopedia of Philosophy (Fall 2014 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/archives/fall2014/entries/feminism-ethics/>.
  • Tuana, N. “Approaches to Feminism”, The Stanford Encyclopedia of Philosophy (Spring 2011 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/archives/spr2011/entries/feminism-approaches/>.
  • Vollmer, F.  “The Narrative Self.” Journal for the Theory of Social Behaviour 35.2 (2005): 189–205.
  • Walker, M. U.  Moral Understandings: A Feminist Study in Ethics. New York: Routledge, 1997.
  • Walker, M. U. Moral Contexts. Lanham: Rowman & Littlefield Publishers, 2003.
  • Watson, G.   Free Will. Oxford: Oxford University Press, 2003.
  • Young, I. M.  Justice and the Politics of Difference, Princeton: Princeton University Press, 1990.
  • Young, I. M. Intersecting Voices: Dilemmas of Gender, Political Philosophy, and Policy, Princeton: Princeton         University Press, 1997.
  • Young, I. M. On Female Body Experience: “Throwing Like a Girl” and Other Essays.  New York:  Oxford University Press, 2005.


Author Information

Anna Gotlib
Email: AGotlib@brooklyn.cuny.edu
Brooklyn College of City University of New York
U. S. A.

Charles Hartshorne: Neoclassical Metaphysics

Charles Hartshorne (1897-2000) was an intrepid defender of the claims of metaphysics in a century characterized by its anti-metaphysical genius. While many influential voices were explaining what speculative philosophy could not accomplish or even proclaiming an end to it, Hartshorne was trying to show what speculative philosophy could accomplish. Metaphysics, he said, has a future as well as a past. He believed that the history of philosophy exhibits genuine, albeit halting and uneven, progress towards a comprehensive understanding of the nature of existence.

Philosophy was, for him, a dialogue that spans centuries, with partners whose wisdom has a perennial relevance. The two philosophers who most influenced him, and in whose work he found the greatest parallels with his own thinking, were Charles Sanders Peirce and Alfred North Whitehead. Hartshorne was co-editor with Paul Weiss of the first comprehensive edition of Peirce’s philosophical papers, and he served as Whitehead’s assistant during the most metaphysically creative period of the Englishman’s career.

Hartshorne considered the metaphysical views he had begun to develop in his 1923 dissertation as, to a great extent, in pre-established harmony with Whitehead’s philosophy of organism. He indicated that Whitehead helped him sharpen his ideas and gave him a better vocabulary to express them, although there remained important differences between the two philosophers. One difference is that theism was always a central element of Hartshorne’s metaphysics (addressed briefly here, but see “Charles Hartshorne: Dipolar Theism” and “Charles Hartshorne: Theistic and Anti-theistic Arguments”) whereas Whitehead was preoccupied for much of his career with a philosophy of nature and did not introduce God until he developed the speculative philosophy of his later works.

Table of Contents

  1. The Nature of Metaphysics
  2. The Question of Method in Metaphysics
    1. Dipolarity
    2. Inclusive Asymmetry/Concrete Inclusion
    3. Position Matrices
  3. Neoclassical Metaphysics
    1. Creativity
    2. Psychicalism
    3. Indeterminism and Freedom
    4. Personal Identity
    5. Time and Possibility
    6. The Aesthetic Motif
  4. Conclusion: Hartshorne’s Legacy
  5. References and Further Reading
    1. Primary Sources: Books (In Order of Appearance)
    2. Primary Sources: Hartshorne’s Response to his Critics
    3. Primary Sources: Selected Articles
    4. Secondary Sources
    5. Bibliography

1. The Nature of Metaphysics

After his first book on sensation, Hartshorne’s philosophical work focused mostly on the questions of metaphysics (see “Charles Hartshorne: Philosophy and Psychology of Sensation”). In Creative Synthesis and Philosophic Method, he provides no fewer than a dozen definitions of “metaphysics” which, he argued, differ only as a matter of emphasis. Central to all of Hartshorne’s definitions is that genuinely metaphysical propositions are unconditionally necessary and non-restrictive of existential possibilities. If metaphysical propositions are true at all, they hold true of all possible world-states or state-descriptions. This means that they are propositions which are illustrated or exemplified by any conceivable observations or experiences when such observations or experiences are properly understood or reflected upon.

“Conceivable observation” is here understood in terms of Karl Popper’s notion that observation is always of the form “such and such is the case” rather than “such and such is not the case.” Cognitive definiteness is gained only by noting what is observed, rather than what is not observed, which is indefinite or infinite. Plato argues that negation is parasitic upon affirmation—“that which is not” is not contrary to what exists, but something different than what exists (Sophist 257b). In effect, quantificational criteria for identity can apply only to events that occur, not events which do not occur. The question, “How many storms did not occur?” has no definite answer. In Hartshorne’s view, there are no merely negative facts. Every negation presupposes some actually existing state of affairs. For example, to say that there are no swans in the lake is to say that every part of the lake is occupied by something other than a swan. Or, more generally, to say that swans do not exist is tantamount to saying that every location in the universe is occupied by something other than a swan. Sheer denials (claims purporting to state negative facts) represent an absence of positivity, and this is a key feature of metaphysical error. Properly metaphysical propositions are unique in never being falsified by any actual or genuinely possible states of affairs and in always being verified by actual or genuinely possible states of affairs. They represent, in effect, the kind of necessity defined since Leibniz and found in modern modal logic as “that which is common to all possibilities.”
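
Hartshorne’s positive reading of negation can be illustrated with a first-order rendering of the swan example. The notation below is an illustrative reconstruction rather than Hartshorne’s own; the predicate letters (S for “is a swan,” L for “is in the lake,” P for “is a part of the lake,” O for “occupies”) are assumptions chosen only for this sketch. The standard equivalence

\[
\neg \exists x\,(Sx \land Lx) \;\equiv\; \forall x\,(Lx \rightarrow \neg Sx)
\]

restates the negative claim as a universal claim about what is the case in the lake, while the distinctively Hartshornean reading insists that this universal claim holds in virtue of positive occupancy:

\[
\forall y\,\bigl(Py \rightarrow \exists x\,(Oxy \land \neg Sx)\bigr)
\]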

This distinguishes genuinely metaphysical propositions from other kinds of a priori necessary propositions, such as truths of mathematics and hypothetical necessities. In Creative Synthesis and Philosophic Method (p. 162), Hartshorne maintains that mathematical propositions are non-existential, for they express relations between conceivable states of affairs. “Two apples plus two apples equals four apples” is an existential assertion containing a true mathematical relation, but “two slithy toves plus two slithy toves equals four slithy toves” is a non-existential assertion that nonetheless contains the same true mathematical relation. The bare arithmetic truth that “2 + 2 = 4” is neutral to existential instantiation. Similarly, “The number nine is not integrally divisible by two” is necessarily the case given the conventional meanings of the vocabulary of finite arithmetic. However, although no conceivable state of affairs falsifies the proposition, it is not verified by any conceivable state of affairs. And while hypothetical necessities express necessary relationships between possibilities, Hartshorne takes them to be covert denials that there are any states of affairs which falsify the relation asserted by the conditional. By contrast, genuinely metaphysical propositions are unequivocally affirmative, and their denials can only be sheer denials (as described above), expressions of utter absence or privation. The denials of metaphysical propositions are impossibilities; they are failed attempts to represent that which would never be found among possibilities.

As a prime illustration of a metaphysical truth, Hartshorne used the proposition, “Something exists.” This is properly metaphysical since it could not be falsified under any conceivable observational or experiential circumstances, yet it could be verified by every such circumstance; in fact, to assert both of these features is to assert something that is analytically true of the proposition, since any attempt to verify the proposition would posit, at minimum, a verification-event which would in turn falsify the counter-proposition that “nothing exists.” Some philosophers suggest that it is a contingent truth that something exists, as seems to be assumed by the question, “Why is there something rather than nothing?” In Creative Evolution, Henri Bergson said that one could attempt to arrive at the idea of nonbeing by imaginatively negating every true statement asserting the existence of something. Hartshorne points out, following Bergson, that this thought experiment is self-defeating. It ends in one of two ways: either there is no assertion, but only a denial, or there is an assertion that is self-referentially incoherent such as, “Nonexistence exists.” It is logically kindred to such “nonsense” propositions as “I was told something by nobody” or “I ate nothingness.” There is literally no possible state of affairs that could make “Nothing exists” true. If it is impossible for “Nothing exists” to be true then “Something exists” must be necessarily true.
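
The shape of this argument can be put in the standard notation of modal logic; the rendering is an illustrative sketch, not Hartshorne’s own formalism. Writing p for “Something exists,” the familiar equivalence between necessity and the impossibility of the negation,

\[
\Box p \;\equiv\; \neg \Diamond \neg p,
\]

licenses the inference that if “Nothing exists” (that is, not-p) is impossible, then p is necessary:

\[
\neg \Diamond \neg p \;\vdash\; \Box p
\]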

If Hartshorne is correct that it is impossible for “Nothing exists” to be true, then there can be no state of affairs that meaningfully contrasts with “Something exists.”  To say that it is necessary that something exists does not provide any information about any existing thing; in other words, “Something exists” is too abstract to tell one about the concretely existing things (pluralism) or thing (monism) that may exist. This observation, however, presupposes the contrast between the abstract and the concrete. A further metaphysical question, therefore, is the relation that exists between the abstract and the concrete. “Something exists” does not describe an existing thing but rather presupposes the existence of entities (or an entity) more concrete than the sentence itself—this is the case even if, per impossibile, only the sentence existed, for “Something exists” is more abstract than “Only the sentence ‘something exists’ exists.” In light of these kinds of considerations, Hartshorne concludes not only that “Something exists” is necessarily true but also that “Something concrete exists” is as well, where the adjective “concrete” is the contrary of “abstract.” There is a hint of paradox in the fact that “concreteness” is itself abstract, but this leads to another of Hartshorne’s definitions of metaphysics as the study of the abstraction “concreteness.” Indeed, Hartshorne maintains that all metaphysical mistakes are instances of what Whitehead called “the fallacy of misplaced concreteness,” that is to say, of mistaking an abstraction for what is concrete.

Propositions must involve conceivable states of affairs in order to count intelligibly as propositions. Natural deduction systems of modern symbolic logic seem to make this supposition, as in the decision of Whitehead and Russell in Principia Mathematica to make “there exists something X which either does or does not have an arbitrary one-place predicate P” axiomatic: in effect, they disallow an empty universe of discourse, since an empty universe produces anomalies in the system, such as counter-instances to the law of Universal Instantiation. While it is to be granted that free logics can avoid this assumption, it is also true that free logics entail difficulties precisely in determining their semantical domains. Most important, free logics that are designed to formalize ordinary language presuppose “objects” in both their inner and outer domains. Despite such monikers as “null inner domains,” such domains assume objects that are non-actual possibles. All free logics that have cognitive import for the description of “possible worlds” assume a semantical domain of objects that are conceptualized on the basis of actual objects or properties; for example, “Batman is a superhero” can be formalized in free logic, but it ultimately makes oblique reference to actualities (bat ears, masks, muscular strength, courage, and so forth) that are posed in non-actual combinations or juxtapositions. In effect, free logics can be interpreted in such a way that they do not contradict basic tenets of Hartshorne’s modal theory. Where cognitively meaningful, they assume objects as values for variables, and they formalize fictional scenarios that indirectly display the conceptual priority of actualities.
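
For illustration only, the Principia assumption mentioned above can be put in standard quantificational notation, with F an arbitrary one-place predicate (the symbolization is a reconstruction, not Whitehead and Russell’s own):

\[
\exists x\,(Fx \lor \neg Fx)
\]

This is a classical validity only on the assumption of a non-empty domain; in a free logic, where instantiation from a universal claim to a particular term is restricted to terms that denote, the formula is no longer guaranteed, which is one way of stating the trade-off described above.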

Hartshorne contrasts metaphysical propositions with empirical and contingent propositions, which are restrictive of some existential possibilities. An empirical proposition is essentially restrictive, always involving an actualization of a state of affairs that excludes other possible alternatives. For example, “Barack Obama resides in the White House during 2011” tells us about states of affairs obtaining in the White House during 2011, and it tacitly excludes the state of affairs of John McCain, his opponent in the 2008 presidential election, residing in the White House during 2011. This feature of exclusion among alternative possibilities is definitive of contingency, and, for Hartshorne, follows from Leibniz’s insight that the full scope of disjunctive possibilities cannot be actualized simultaneously or conjunctively, since there are incompossible possibilities. Thus, the selection among possibilities confronted by natural processes must involve the acceptance of one alternative and the rejection of others, and this is a signature feature of empirical propositions. Hartshorne never considered the many-worlds interpretation of quantum theory, which, by virtue of quantum branching into conjunctively realized alternative space-times, denies Leibniz’s principle of contingency as exclusion of alternatives. (For a critique of the so-called “actualist” account of many-worlds ontology and defense of the coherence of process philosophy and quantum theory, see Shields 2008.)

If empirical propositions are essentially restrictive, it follows that every empirical state of affairs is positive, but has negative implications. The denial of these negative implications is also an empirical state of affairs. For example, one alternative to Obama’s having won the 2008 presidential election is Hillary Clinton’s having won it. Since this alternative did not occur, the denial of this alternative, namely, “Hillary Clinton did not win the 2008 presidential election” is true of the actual world. However, if an empirical proposition is one which excludes alternatives, how is this true of negative empirical implications of such propositions? Is not a negative empirical proposition simply an assertion of an absence or privation? Hartshorne holds that this is clearly not the case. What is excluded from actualization in the above negative empirical statement is Hillary Clinton’s winning the 2008 presidential election, and this exclusion is achieved by a positive state of affairs. Positivity and exclusion of possibilities are thus features of all empirical propositions. Thus, unlike metaphysical propositions, empirical propositions have both an affirmative and a negative logical quality.

The division between metaphysics and empirical science is, in principle, clear. Hartshorne notes that, in practice, it is not always clear which statements count as empirical and which as metaphysical. It is well to keep in mind that Hartshorne uses Popper’s idea about falsifiability as a criterion of what it means to be an empirical statement and not as the guiding method of empirical science. Popper elevated falsification over verification as the proper method of science. Hartshorne does not address in a systematic way the question of the proper methods of science; even so, showing that a given statement is falsifiable is, on Hartshorne’s principles, one way in which it can be discredited as a true metaphysical idea. If a true metaphysical claim is falsified by no conceivable observation it is also the case that it is verified by every conceivable observation. Hartshorne holds that verifiability fails as a criterion for empirical statements but succeeds as a criterion for true metaphysical statements. It follows that false metaphysical ideas are falsified by every conceivable observation and verified by none.

A nuanced issue emerges, however, when one considers particular case studies of the relationship between metaphysical and empirical propositions in Hartshorne’s theory. Some critics have urged that Hartshorne’s neoclassical positions may sometimes conflict with apparently well-corroborated empirical scientific hypotheses. Among other hypotheses, these include (i) the apparent empirical result from special relativity theory that there is no cosmic simultaneity and thus no privileged or divine time (Hartshorne’s theory of deity posits a temporal God), or (ii) the beginning of physical events in space-time a finite time ago as posited in standard hot big bang cosmologies (Hartshorne’s metaphysics of creativity posits an infinite past of cosmic epochs, the latest of which is our actual cosmos since the purported big bang event). Such apparent conflicts, however, do not actually speak to Hartshorne’s general theory of the difference between metaphysical and empirical propositions. Rather, they concern whether the specific propositions he proposes as metaphysical are in fact illustrated by any conceivable state of affairs.

While Hartshorne can be described as a kind of rationalist insofar as he maintains, like classical rationalists such as Descartes and Leibniz, that metaphysics is a matter of consistent and adequate meanings of concepts, he is hardly a dogmatic “armchair” or purely speculative philosopher who desires no engagement with the special empirical sciences—his first and thirteenth books demonstrate that he was a serious psychologist and ornithologist. His rationalism is in fact “critical” and rather severely qualified. For instance, apropos of the above comment regarding the question of the “success” of his metaphysical project, Hartshorne speaks in Creative Synthesis (Ch. II) of metaphysics as our quite “contingent ways of trying to become conscious of the non-contingent ground of contingency,” and he insists on the qualification that the notion of the a priori should hardly be conflated with the epistemic notion of “certainty.” With Whitehead, Hartshorne insists that philosophers should be epistemically wary by avoiding the “dogmatic fallacy” found in the confidence of the Continental rationalists. In “The Development of My Philosophy” (1970), Hartshorne declares, “All philosophizing is risky: cognitive security is for God, not for us.”

There are at least three considerations which make it clear that, at the very least, it is not obvious that Hartshorne’s neoclassical metaphysics conflicts with the above-mentioned empirical hypotheses, or that he is cavalier about empirical challenges. Following Popperian distinctions, Hartshorne never claimed that his proposed metaphysics is in principle exempt from empirical disconfirmation, although it is exempt from the quite distinct notion of empirical confirmation. If a “metaphysical” proposal really does conflict with an empirical fact, then it is disconfirmed and fails to be a genuinely metaphysical proposition. No genuinely metaphysical proposition, however, could be “empirically confirmed” in the standard sense that some restrictive state of affairs as opposed to another illustrates the proposition, because this would deny the universality of the candidate metaphysical proposition’s requirement that it be illustrated by any conceivable state of affairs. This requirement does not prevent it from being the case that some states of affairs are phenomenologically “privileged” in the sense that certain metaphysical truths may be more readily apparent in special cases of phenomena. Hartshorne agrees with the early Heidegger that metaphysics can be about profoundly general concepts, yet such concepts are neither phenomenologically vacuous nor inexplicable nor utterly without discernible structure. For instance, the process metaphysical theory of the necessarily “social structure of all experience” might be seen with particular clarity via the special phenomenon of the “active concern” (Heidegger’s sorge) of human being.

The determination of the relevant “empirical facts” (or interpretations of them) which a philosopher is forced to accept is a subtle, highly theory-dependent and much disputed matter, especially regarding the above mentioned cases of relativity theory and big bang cosmology. For example, it is not clear or agreed upon by philosophers of science that relativity physics establishes that time is “relative” even in Newton’s sense or that special relativity robs us of any objective, uniform notion of temporal modes of past, present, or future; nor is it clear that the standard big bang model, even if sound, “proves” the absolute finitude of either time or creative process as such. W. H. Newton-Smith, in The Structure of Time, argues that the notion of an “empirical” proof of a beginning of time even when granting a big bang singularity is highly problematic.

Hartshorne was cognizant of the prima facie tensions between relativity and big bang theory and his neoclassical metaphysics, and he offered plausible conciliatory suggestions: For example, consider his embrace of quantum physicist H. P. Stapp’s notion of a primordial, asymmetrically well-ordered sequence of events upon which space-time location is dependent. Stapp’s idea harmonizes the relativity of spatio-temporal observations dependent upon light-cone propagation and the ultimacy of ontological asymmetry demanded by process theory. Consider also Hartshorne’s observation that big bang theory establishes, at most, the contingent origin and present physical chronometry of time appropriate to our “cosmic epoch.” At any rate, whether or not these conciliatory suggestions are successful, Hartshorne attempted to follow through with the directives of his theory of metaphysics. As he says in Creative Synthesis, “there must be an at least possible way of harmonizing what physicists say is true of our epoch and what metaphysicians say is true of all possible epochs (since it forms the content of ideas of such generality that there is nothing we can think which is not a specialization of this content).”

2. The Question of Method in Metaphysics

That Hartshorne thought at length about questions of philosophical method can be inferred from what Paul Weiss called the systematic "machinery" at work in his metaphysics, and from the very title of one of his most important mature philosophical works, Creative Synthesis and Philosophic Method. Hartshorne's method for neoclassical metaphysics results from both original insights and critical reflection on a wide swath of variegated influences. These range from the work of American pragmatists (especially Peirce), to phenomenology, to the speculative thought of Whitehead, to the work of analytic philosophers (with particular attention given to Popper and the logical investigations of his Harvard teachers Lewis and Sheffer as well as his University of Chicago colleague Carnap). The section titled "Reply to Everybody" published in The Philosophy of Charles Hartshorne lists no fewer than twenty-one methodological principles to be used in the proper adjudication of metaphysical claims. Among the most important of these are what could be termed the principles of "positivity," of "dipolar contrast," of "inclusive asymmetry," and of Peirce's doctrine of "position matrices or diagrams." We explained the principle of positivity, or the rejection of purely negative facts, in the previous section, so let us turn to a discussion of the other principles.

a. Dipolarity

Hartshorne’s principle of dipolar contrast derives, in part, from the semantic “law of polarity” found in Morris R. Cohen’s A Preface to Logic. Following Cohen, Hartshorne holds that genuine metaphysical concepts are semantically interdependent. In effect, such concepts have logical contraries which cannot mean anything in utter isolation from one another. In spite of the extreme generality of metaphysical concepts, each such concept entails a polar contrast to it. Even the highly general concept “reality” requires that the concept “unreality” be assigned some meaning. To use Hartshorne’s illustration, the concept of “reality” ought to include the notion of having mental states, but the concept of “unreality” should include the notion of intentional objects of real mental acts which fail to designate anything extra-mental. Perhaps a more telling example could be found in the notion of necessity. A standard definition of necessity is “that which has no alternative,” yet alternativeness clearly invokes contingency, since a contingent state of affairs is to be characterized as “this rather than that alternative.” Hence, the semantical analysis of necessity invokes contingency. For Hartshorne, then, each metaphysical concept has a corresponding contrast: necessity requires contingency, being requires becoming, unity requires variety, and so on, for any concept that is non-restrictively general, having applicability across possible states or state-descriptions. The two interdependent contraries in each case warrant the term dipolarity.

Lack of recognition of dipolarity is, for Hartshorne, a chief difficulty in previous metaphysical theories that suppress expression of a polar contrast. In effect, they suffer from a certain conceptual poverty or “fallacy of monopolarity.” Monopolar theories allow expression of only one pole of a pair of contrasts; stated obversely, they completely deny one pole of a pair of contrasts. One example of denying dipolarity is monistic theories such as that of Spinoza, which allow causal necessity and internal relatedness, but which disallow contingency and external relatedness. At the opposite “monopolar” extreme are logical atomist theories like that of Russell, which allow causal contingency and external relatedness, but which disallow causal necessity or internal relatedness. Hartshorne asks if these contrary extremes make any more sense than supposing that doors must have hinges on both sides or on neither side. Hartshorne’s metaphysical project is guided by the observance of dipolarity and thus conceptual inclusiveness; in his view, a neoclassical process theory of reality is structurally dipolar and offers comprehensive accommodation of both necessity and contingency, both causal determination and a degree of freedom from such determination, both internal and external relations, and so forth, throughout the range of metaphysical polar contrasts.

b. Inclusive Asymmetry/Concrete Inclusion

Hartshorne’s principle of dipolarity is complemented and qualified by a principle of inclusive asymmetry or concrete inclusion. As Hartshorne points out, the principle of dipolarity does not justify metaphysical dualism. One should distinguish between asserting that a metaphysical concept requires a contrary polar conception in its definition, and asserting that two polar concepts have an equivalent metaphysical status. It may well be the case that one concept requires the other polar concept in its definition, while the other polar concept both requires the polar contrast in its definition, and yet is itself the ground or source of that polar contrast. In other words, it may be the case, as Hartshorne asserts, that dipolarity is itself grounded in a logically asymmetrical relation between the contraries.

The model for this relation can be seen in logical implication, which Hartshorne, following Peirce’s trail-blazing work on “illation” as logically fundamental or primal, takes to be the ultimate concept in formal logic and a resource for metaphysical generalization. “p implies q” means that p both implies itself and q. This can be formally expressed in the tautology that (p ⊃ q) iff [(p ⊃ p) & (p ⊃ q)]. (This result is mirrored in Lewisian systems in which the formula—changing material implication to strict implication—is a theorem.) However, given a standard material implication, p ⊃ q (where p and q are not equivalent in meaning), we cannot say conversely that q logically implies p. This is reflected in the fact that the correlative formula (p ⊃ q) iff [(q ⊃ q) & (q ⊃ p)] is not a tautology, for it is false on the truth-tabular conditions that  p and q have opposite truth values, and thus implicitly involves a species of “fallacy of affirming the consequent.” (Analogously, the similar formula using strict implication is not a theorem.) Thus, entailment is essentially asymmetrical.
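This asymmetry can be checked mechanically. The following short Python sketch (an illustration added here, not anything in Hartshorne's text) runs through all truth-value assignments and confirms that the first biconditional, (p ⊃ q) iff [(p ⊃ p) & (p ⊃ q)], is a tautology, while the correlative formula, (p ⊃ q) iff [(q ⊃ q) & (q ⊃ p)], is not.

```python
from itertools import product

# Material implication: "p implies q" is false only when p is true and q is false.
implies = lambda p, q: (not p) or q

def is_tautology(formula):
    """True if the two-variable formula holds under every truth-value assignment."""
    return all(formula(p, q) for p, q in product([False, True], repeat=2))

# (p ⊃ q) iff [(p ⊃ p) & (p ⊃ q)] -- the tautology cited above
forward = lambda p, q: implies(p, q) == (implies(p, p) and implies(p, q))

# (p ⊃ q) iff [(q ⊃ q) & (q ⊃ p)] -- fails whenever p and q differ in truth value
converse = lambda p, q: implies(p, q) == (implies(q, q) and implies(q, p))

print(is_tautology(forward))   # True
print(is_tautology(converse))  # False
```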

Consider furthermore the defining power of variant connectives of standard systems of propositional logic. For Hartshorne, it is immensely significant that the defining power of propositional operators or functions "varies inversely with symmetry." The symmetrical function of logical equivalence, as in "p if and only if q," has the least defining power of the propositional functions, since, even when combined with negation, it can be used to produce only eight (including itself) out of the sixteen propositional functions. On the other hand, the directional or asymmetrical functions, which contrast with the equivalence function, are constitutive of entailment. Hartshorne points out that Peirce, and then Sheffer, were the first to see that either the negation of conjunction ("not both," the stroke) or the negation of disjunction ("neither/nor," the dagger) is singly sufficient to define all the other propositional functions.
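The claims about defining power can likewise be verified computationally. The sketch below (again an added illustration, not Hartshorne's) enumerates every two-variable truth function definable from p and q with a given stock of connectives: equivalence plus negation yields only eight of the sixteen, while "not both" and "neither/nor" each yield all sixteen on their own.

```python
from itertools import product

# A two-variable truth function is represented by its truth table: the tuple of
# outputs for inputs (p, q) in the fixed order (F,F), (F,T), (T,F), (T,T).
INPUTS = list(product([False, True], repeat=2))
P = tuple(p for p, q in INPUTS)   # the projection "p"
Q = tuple(q for p, q in INPUTS)   # the projection "q"

def closure(unary_ops, binary_ops):
    """All truth tables definable from p and q by composing the given connectives."""
    defined = {P, Q}
    while True:
        new = set(defined)
        for op in unary_ops:
            for f in defined:
                new.add(tuple(op(x) for x in f))
        for op in binary_ops:
            for f in defined:
                for g in defined:
                    new.add(tuple(op(x, y) for x, y in zip(f, g)))
        if new == defined:
            return defined
        defined = new

iff  = lambda x, y: x == y          # equivalence ("if and only if")
neg  = lambda x: not x              # negation
nand = lambda x, y: not (x and y)   # Sheffer stroke: "not both"
nor  = lambda x, y: not (x or y)    # Sheffer dagger: "neither/nor"

print(len(closure([neg], [iff])))   # 8  -- equivalence, even with negation
print(len(closure([], [nand])))     # 16 -- the stroke alone defines everything
print(len(closure([], [nor])))      # 16 -- so does the dagger alone
```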

The Sheffer functions (the "stroke" and "dagger") are the functions with the greatest defining power, but they possess a triadic asymmetry that yet includes dyadic symmetry. We see this, Hartshorne notes, in their truth-tabular definitions. The Sheffer stroke is false if and only if both propositional variables are true, while the Sheffer dagger (also Peirce's ampheck) is true if and only if both propositional variables are false. In effect, the triadic relation of the stroke, that is, the truth-value product of the binary Sheffer construction p|q, which is dyadically symmetrical in terms of its propositional truth-value assignments (p is true and q is true), stands as an asymmetry in terms of its truth value (that is, it is false in relation to symmetrical truth). Hartshorne finds a metaphysically ultimate pattern here, namely, symmetry within an all-embracing asymmetry.
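For ease of reference, the truth-tabular definitions just described can be set out as follows (a standard layout, not Hartshorne's own):

p   q   p | q ("not both")   p ↓ q ("neither/nor")
T   T          F                      F
T   F          T                      F
F   T          T                      F
F   F          T                      T

In each case the distinguished truth value attaches to the symmetrical row (the stroke is false only where both variables are true; the dagger is true only where both are false), which illustrates the pattern of symmetry within an all-embracing asymmetry that Hartshorne highlights.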

Hartshorne holds that the relation between dipolar metaphysical contraries exhibits this asymmetrical structure. As an illustration, consider his argument in Creative Synthesis that “becoming” logically contains its polar contrast “being,” but not the converse. Suppose there is a reality, X, that does not come to be, that is eternal, and another reality, Y, that does come to be. The total reality, XY, is not eternal; XY comes to be, for Y itself is not eternal. This shows that becoming is the more inclusive category, for it preserves itself (becoming) and its polar contrast (being). No comparable argument can show that being can include becoming without destroying the contrast. The concrete or definite, the creatively cumulative, is the inclusive element, and is the key to the abstract, not vice versa. The concrete and the abstract are neither sheer conjuncts as posited by varieties of dualism, nor some mysterious “third” entity, but, in consonance with both Whitehead’s ontological principle and Aristotle’s ontological priority of the actual, is rather, “the abstract in the concrete.”

In his “Logic of Ultimate Contrasts” (Creative Synthesis, Ch. VI and Zero Fallacy, Ch. VII), Hartshorne calls the concrete terms in a pair of metaphysical contraries the r-terms (correlated with Peirce’s categoreal “seconds” and “thirds”), while abstract terms are called a-terms (correlated with Peirce’s categoreal “firsts”). While he provides 21 r-terms and 21 a-terms in his table of metaphysical contraries, a few samples could be taken as illustrative, especially given his Rule of Proportionality, namely: as any one r-term stands to its contextually correlated a-term, all other r-terms stand to their contextually correlated a-terms.

[Graphic omitted: Hartshorne's table of correlated r-terms and a-terms]

Hartshorne argues that each r-term includes its correlative a-term, but not vice versa. Given the items above, we see that, for Hartshorne, the analysis of experience should be constructed so as to include the notions that objects or things experienced are independent of or externally related to the contingent acts of experience which include the objects as their necessary (but not sufficient) conditions. If correct, these conceptual relations all exhibit the essential asymmetry of entailment. Yet, there is a two-way necessity within this overall asymmetry, for while the relation of logical inclusion falls always on the r-term side of the table, a-terms nonetheless necessitate that “a class of suitable r-term correlates be non-empty.” For example, the necessary can be expressed, Hartshorne contends, as “the non-emptiness of the class of contingent states of affairs.” (This particular rumination is a key feature of Hartshorne’s revision of the ontological argument; see “Charles Hartshorne: Theistic and Anti-theistic Arguments”.)

While the detailed arguments for and against the proper adjudication of each case of the r-term/a-term relation form a complex affair that cannot be presented here, it is interesting to notice that some independent considerations of modern logic arguably shore up Hartshorne's basic principle of r-term inclusion. For example, Hartshorne pointed to the fact that an important theorem of contemporary modal logic "mirrors" the logical inclusiveness of contingent concreta or "r-terms" in juxtaposition with the abstract necessity or "a-terms," namely, the theorem that [(Np & ~Nq) ⊃ ~N(p & q)], where N is a modal operator for "necessarily." In effect, the conjunction of necessary and contingent propositions logically entails the modally contingent status of the conjunction of assertoric propositions—that is, contingency in a relevant sense "includes" necessity rather than vice versa.
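A brief derivation, offered here only as a sketch valid in any normal modal logic (it is not Hartshorne's own proof), shows why the mirrored theorem holds:

1. N(p & q) ⊃ (Np & Nq)   [necessity distributes over a conjunction]
2. ~(Np & Nq) ⊃ ~N(p & q)   [contraposition of 1]
3. (Np & ~Nq) ⊃ ~(Np & Nq)   [truth-functional logic]
4. (Np & ~Nq) ⊃ ~N(p & q)   [from 3 and 2 by hypothetical syllogism]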

c. Position Matrices

Hartshorne also holds that metaphysical theories can be tested by subjecting them to processes of rational elimination and/or comparison of cognitive costs that begin with a formal logical elaboration of theoretical possibilities. This idea has its origin in Peirce's doctrine of position matrices or diagrams. The point here is that no philosophical topic can be declared fully rationally adjudicated until the constituent fundamental aspects of that topic have been subjected to an exhaustive "mathematical analysis." Much error can occur unless and until all possibilities have been foreseen and subjected to thorough rational consideration. Consider the issue of the God-world relationship in philosophical theology. Hartshorne argues that there are sixteen combinatorial possibilities for theological and atheological models of this relationship when we notice that the concepts of God and world can each be ontologically necessary, ontologically contingent, necessary in some aspects and contingent in others, or neither ontologically necessary nor contingent. In the following matrix, upper case letters (N and C) represent ontological modalities as applied to God and lower case letters (n and c) represent ontological modalities as applied to the world. The zero case (O) represents lack of modal status or impossibility:

[Graphic omitted: Hartshorne's 4 × 4 position matrix combining the divine modalities (N, C, NC, O) with the worldly modalities (n, c, nc, o)]

Hartshorne’s matrix provides a method of making distinctions among various types of historically significant worldviews as well as highlighting the distinctiveness of his own position. For example: Parmenidean monism or classic Advaita Vedanta can be symbolized as N.o; early Buddhist thought is O.cn; Aristotle’s theism is N.cn; Aquinas’ theism is N.c; Stoic and Spinozistic pantheism is N.n; LaPlacean atheism is O.n; John Stuart Mill’s theism and most forms of deism are C.n; William James’s theism is C.c; Jules Lequyer’s is NC.c; Bertrand Russell’s atheism is O.c. Hartshorne argued that his preferred option (NC.nc) is the most formally inclusive of the theoretical options, and that no specific options are logically compossible (otherwise we would have modal incoherence or contradiction).

Hartshorne’s presentation of the position matrix representing necessity and contingency as applied to God and world developed over the course of his career. He did not come to the four-row, four-column arrangement until after his ninetieth birthday, with the help of Joseph Pickle at Colorado College. A more substantive change was in the way that Hartshorne interpreted the zeros. In Creative Synthesis, the zeros are the atheistic and acosmic positions. In later discussions, however, he interprets the zeros more broadly as “God is impossible (or has no modal status)” and the “World is impossible (or has no modal status).” To illustrate the difference between these interpretations consider the position of W. V. O. Quine. He would say that God does not exist, the world does exist, but the world has no modal status. This option cannot be represented as O.n, O.c, or as O.cn since each presupposes modal status for the world. Nor can it be represented as O.o without serious distortion, since Quine does not deny that the world exists. Another illustration of the problem is Robert Neville’s emphasis on apophatic theology. In Neville’s view, the necessary/contingent contrast is a product of God’s creative act; God cannot be characterized as either necessary or contingent, but only as indeterminate, at least prior to the act of creation. Hartshorne’s table, as presented here, finesses these issues by interpreting the zeros in a strictly formal fashion to mean “neither necessary nor contingent,” leaving open the possibility of further refinement.

Whatever one’s ultimate convictions about this particular topic, Hartshorne’s approach arguably represents an advance in metaphysical or philosophical theology since it provides a matrix that may well suggest missed possibilities in traditional or conventional ways of thinking about the topic. Furthermore, Hartshorne’s method can be extended: similar 16-fold matrixes can be made for other polar contrasts such as infinite/finite, eternal/temporal, and so on. If any two matrixes are combined (16 X 16) the number of formal alternatives leaps to 256. More generally, if m equals the number of contrasts one wishes to include in talking about God and the world, then 16m is the number of formal alternatives available. There is no apparent antecedent in the history of philosophical theology of Hartshorne’s matrices. It is no wonder, therefore, that he considered them as one of his original contributions to metaphysics.

3. Neoclassical Metaphysics

Hartshorne referred to his metaphysics as “neoclassical” to emphasize its continuity with classical traditions, especially as they sprang up in antiquity from the Presocratic philosophers and from Plato and Aristotle. He was also keen to stress that his views are importantly different, or new (“neo”), in their substantive claims. He would eventually highlight the parallels of his metaphysics with ideas in early Buddhist thought. The family of metaphysical views to which Hartshorne’s ideas belong is often called process philosophy or, following Bernard Loomer, process-relational philosophy. One finds anticipations of process-relational philosophy in Peirce’s tychism, James’s pluralistic universe, and Bergson’s la durée. Hartshorne was influenced by these philosophers (with Peirce being the most dominant of the three) but his greatest debt was to Whitehead.

a. Creativity

Philosophers venture various hypotheses as to the character of the finally real constituents of existence. One remembers Parmenides’ Being, Democritus’ tiny impenetrable atoms, Aristotle’s hylomorphic ousia, Descartes’ dual substance ontology, Leibniz’s monads, and Whitehead’s actual entities. Hartshorne adopted a Whiteheadian view, sometimes speaking of “dynamic singulars” instead of “actual entities.” Dynamic singulars are instances of what Hartshorne called “creative experiencing,” an expression that suggests an activity of synthesis, a bringing together of diverse elements from an entity’s antecedent world into a unity of feeling. Hartshorne often used Whitehead’s word “prehension” to name the feelings from which a dynamic singular weaves its own experience from the welter of data from its past. The “diverse elements” from the past that are synthesized are themselves instances of creative experiencing; for this reason, Hartshorne was fond of the expression “feeling of feeling,” which is close to Whitehead’s language in Process and Reality (Ch. X, sec. II). The prime example on which both Whitehead and Hartshorne model this activity is memory. Memories are themselves experiences that may have previous experiences as component parts. Moreover, memories are active in the way that they highlight some items of experience but place other items in the background, sometimes almost forgotten. Memory also serves as a model of the way emotional tone suffuses experience, in accordance with Hartshorne’s theory of the affective continuum. Finally, in keeping with process-relational philosophy, memory is a process, a coming-to-be, and not an unchanging substance; its very existence, moreover, depends upon its relation to past events.

Hartshorne agreed with Whitehead when the latter spoke of creativity as “the category of the ultimate.” In Whitehead’s words, “the many become one and are increased by one” (Process and Reality, Ch. II, sec. II). For both Whitehead and Hartshorne, creativity is not itself a substance but rather the name for the activity that characterizes every concrete particular, from the lowliest puff of existence to God. Thomas Aquinas restricted creativity in the strict sense to deity alone. Whitehead and Hartshorne, on the other hand, treat creativity as what medieval thinkers called a transcendental, a universal concept that is not restricted to this or that kind of real thing but which identifies a thing as such as real. Another departure from traditional ideas about creativity is that, for Whitehead and Hartshorne, creativity is never from nothing (ex nihilo), whether it is God’s creativity or the creativity of individuals within the cosmos. According to Hartshorne, the “nothing” in the expression “creatio ex nihilo” would be a purely negative fact. As noted in a previous section, Hartshorne rejects the existence of such facts. Thus, Hartshorne concluded that a creative act always presupposes an antecedent world from which the novel act arises.

Hartshorne’s emphasis on creativity illustrates his commitment to the principle—summarized in the previous section—that that which comes-to-be (becoming) includes but is not included by that which is but does not come to be (being). Hartshorne insists on taking “becoming” in the strictest sense as a process that adds to the definiteness of reality something that was not included in the class of real things prior to the act of becoming. Nothing corresponds to the word “reality” considered as a single nontemporal or eternal fact; rather, reality grows with every act of becoming and is, as it were, defined by them. Hartshorne rejects the idea that there is literally “nothing new under the sun”; on the contrary, there was a time when even the sun was new. Hartshorne is not simply reaffirming the flux of Heraclitus where all concrete things change; he is affirming that reality is a growing totality, an idea that is also prominent in Peirce’s evolutionary cosmology. The growth of reality, moreover, is thoroughly temporal—time itself is the process of creation. The past is determinate, the future is a field of relatively indeterminate possibilia, and the present is the process of determination. Finally, Hartshorne argues that what comes to be, once it has become fully determinate, is a permanent fixture of all subsequent becoming, guaranteed in the final analysis by God’s memory of it. This is why Hartshorne speaks of creation as a cumulative process.

b. Psychicalism

The fact that, for Hartshorne, experience is ontologically foundational means that his metaphysics is a type of what has traditionally been known as panpsychism. Early in his career, Hartshorne used “panpsychism,” distinguishing true and false versions of the doctrine. Later he preferred “psychicalism” and he said that he did not object to David Ray Griffin’s word “panexperientialism.” Hartshorne attributed mind-like qualities to every concrete particular (that is, dynamic singular), but his metaphysics cannot be described as anthropomorphic. He accepted Leibniz’s two-fold criticism of Descartes that self-consciousness is not the only form of human experience, and that human experience is not the only form of experience. In his second book, Beyond Humanism, Hartshorne points out that a dog need not become a human in order to suffer. In keeping with the theory of the affective continuum Hartshorne conceives mind-like qualities as existing along a continuum from the simplest feelings to the most complex thoughts. He argues that it is precisely in its psychological characteristics that it is possible for a nonhuman being to be infinitely other than a human being. This is because psychological variables such as memory, feeling, and volition are infinitely variable. Memories are conceivably of any span (a few seconds, a million years, and so forth) and of any condition of vagueness or precision; feelings can be any degree of intensity or complexity; volitions, which presuppose memory and feeling, are likewise infinitely variable.

Hartshorne denied the assumption of much of modern philosophy that an experience can have only itself as an object. The errors of waking experience as well as the false impressions during dreams provide no sure ground for a global skepticism—in the words of Peirce, “as if doubting were as easy as lying.” Hartshorne maintains that the question “What if all of our experience is a dream?” is based on a faulty phenomenology of dreaming and he points to Henri Bergson’s small book, Le rêve. Bergson argued that, during dreams, perceptions are indistinct, memory is free-floating, and attention is mostly disengaged, but the connection with the world through the body is never severed. Events and concerns of the day as well as immediate stimuli from the environment regularly appear in our dreams. Hartshorne gives the example of having dreamed of a propeller airplane and, as he awoke, hearing the sound of the airplane blend imperceptibly into the sound of a fan blowing in the room. As perception is not lacking in dreams, so more generally experience is always of something not itself. What philosophers call “the given” in experience are, according to Hartshorne, the independent causal conditions of the experience. Introspection too conforms to this model: it is a present experience having the immediately previous experience as an object. Experience, at every level imaginable, is essentially social—dynamic singulars feeling the feelings of others.

Hartshorne rejects the assumption that minds are essentially non-physical entities. Even Descartes, who argued for this claim, acknowledged that certain mental qualities are experienced as spread throughout one’s body or as being in specific regions of the body. Mental and physical qualities are indeed distinguishable but it does not follow that they are separable. Descartes raised the question of the criteria for the presence of mind in a physical object, thereby making materialism the default position for anything outside one’s own consciousness; since, however, mind-like qualities are so pervasively present in varying degrees in so much of nature, Hartshorne asked for the criteria for the absence of mind. The problem, as Hartshorne sees it, is as much with the concept of mind as with the concept of the physical or of matter. He raises the question whether there is anything that positively corresponds to the concept of a merely physical entity, that is to say, a physical entity in which mind-like qualities—not simply human mind-like qualities—are wholly absent. To be sure, there are physical entities in which mind seems to be absent, but Hartshorne argues that this is no more evidence of the absence of mind than the appearance of inactivity in a physical object is evidence that there is no activity in it. Leibniz guessed otherwise and modern science is on his side; the micro-world, even where apparently “dead matter” is concerned, is buzzing with activity. The old adage, “absence of evidence is not evidence of absence” applies.

In arguing for the ubiquity of mind-like qualities, Hartshorne found inspiration in certain aspects of Leibniz’s panpsychism. With Leibniz, he distinguished parts and wholes. The parts—Hartshorne’s dynamic singulars—have mind-like qualities even if some wholes of which they are made lack them. He argues by analogy that feeling can be everywhere even though not everything feels. For instance, a flock of birds does not have feeling, but there are feelings in the individual birds. Hartshorne explains the difference using a modified version of Leibniz’s concept of a “dominant entelechy” according to which some physical systems are organized in such a way that the experiences of the dynamic singulars (the parts) can be channeled into a single more or less unified stream of experience or even conscious experience, as in the case of animals with complex nervous systems. In Hartshorne’s theory, the body not only reacts to the world around it, but also reacts to itself. We feel the feelings of at least some of our cells. As Hartshorne said, hurt my cells and you hurt me. Some organic wholes, such as plants, do not have a structure integrated enough to allow for a dominant stream of experience. Hartshorne viewed plants as having no feeling, but he attributed feelings to their individual cells. He held that the phototropism of a flower tracking the sun is more a function of the activity of the cells than of the plant as a whole. Hartshorne generalizes this analysis along Leibnizian lines to the inorganic world. Leibniz spoke of monads in inorganic substances as being in a “stupor.” Hartshorne attains a similar result in his theory of the infinite variability of mind-like qualities. There is no such thing as “mere matter,” only matter in which mind-like qualities are far removed from what is recognizably human-like, animal-like, or even cell-like. With Leibniz’s distinctions, Hartshorne is able to theorize that there is experience in every object, but not every object of experience is an experiencing object.

Despite Hartshorne’s use of Leibniz’s ideas, the dissimilarities between their versions of panpsychism are as striking as their similarities. As already noted, dynamic singulars are entities that come to exist in the creative-cumulative advance of the world; Leibniz’s monads do not come to exist within the universe but are coexistent with it. For Leibniz, God’s creation of the universe is nothing more than God’s creation of the monads that make it up. Another significant difference between the two philosophers concerns relations of cause and effect. Leibniz denied causal relations among nondivine monads—they are “sans fenêtres” (windowless); he secured the appearance of relations of cause and effect by positing a divinely imposed pre-established harmony. For Hartshorne, every dynamic singular is both a partial result of causal conditions that precede it and a partial causal condition of events that succeed it. In short, every dynamic singular is both an effect and a cause. The word “partial,” especially as regards the relation from cause to effect, is important. Hartshorne rejected determinism (see below), and this represents another departure from Leibniz. For Hartshorne, causal conditions are necessary, but not entirely sufficient, for the emergence of a dynamic singular. The individual’s response to its own causal past—the way it synthesizes the world given to it—provides an ineradicable aspect of the explanation for why it is the way it is. It acts and is not merely acted upon. According to Hartshorne, the same principle applies to God, although allowance must be made in the divine case for the modal difference between existence and actuality (see below). The twin ideas that there are real relations among dynamic singulars and that each is unique by virtue of its manner of experiencing the world highlight two features of Hartshorne’s metaphysics. First, reality has a social structure (see below, under “personal identity” for a discussion of the meaning of “social”); second, every concrete particular that “makes” the world retains at least a minimal degree of freedom.

One objection to Hartshorne’s theory is that mental qualities seem to require a central nervous system. In Beyond Humanism, Hartshorne makes several points that are crucially relevant to this objection. He notes that among animals with central nervous systems, physical and psychical qualities are correlated. Hartshorne observes, in an almost Teilhardian turn of phrase, that physical complexity is a sign of psychical complexity. The more complex is the mental life, the more complex is the nervous system that underlies it. Can one generalize beyond creatures with a nervous system? Hartshorne points out that one-celled animals manage the functions of digestion, oxygenation, and locomotion without the organs and body parts that in creatures with nervous systems make these possible. He asks whether mental function, broadly conceived, may not be analogous. Is it any more reasonable to say that a paramecium feels nothing because it lacks a central nervous system than it is to say that it cannot swim because it has neither motor nerves nor muscle cells? If it has primitive feelings, then it displays them behaviorally in the only way it could, by responding to stimuli. Hartshorne argues that the only conclusion that can be drawn from physiology is that similarity of mind between a one-celled creature and a human is limited by the dissimilarity of their bodies. Physical wholes insufficiently organized to allow a dominant stream of experience are the closest thing in Hartshorne’s philosophy to what materialists call “matter.”

An important objection to Hartshorne's psychicalist theory is suggested in the work of Karl Popper. In his classic treatise on the mind-body problem titled The Self and Its Brain (co-authored with neuroscientist John Eccles), Popper objected to "psychicalist" or "proto-mental" conceptions of the brain's elementary particles, arguing that such conceptions have no empirical explanatory power and are thus "metaphysical in the bad sense." Popper maintains that elementary particles can have no "interior states" because they are "completely identical whatever their past states." For example, any arbitrary proton selected at any time for measurement will have the same physical properties as any other proton selected at any time for measurement: its mass will be 938 MeV/c²; its charge +1; and its spin ½.

Contra Popper, it does not follow from this that elementary particles are absolutely or predicatively identical no matter what their past states. To use Hartshorne's dipolar vocabulary, Popper is here conflating "gen-identity" (identical characteristics over time) and "strict identity." Such properties as mass, charge, and spin are gen-identical features of protons that are present in each proton-occasion. However, protons do not remain static in terms of their empirically discernible behavior over periods of time. For example, a proton P in a tritium nucleus of hydrogen (a nucleus of hydrogen with one proton and two neutrons) has a rate of radioactive decay, whereas a distinctive proton P* in a lead-206 nucleus (one of the four "stable" isotopes of lead) has no such decay, as is now familiar to us through the "half-life radiation law." Notice that the behavioral differences occur precisely because of differences in physical contexts. That physical context matters to the behavior of protons is readily explicable in a Hartshornean interpretation of elementary particle-occasions, because such particle-occasions are "open" to their environments—in Whitehead's vocabulary, the environments are their "actual worlds"—through prehension. More recent empirically well-corroborated developments in quantum physics are likewise readily explicable in Hartshorne's psychicalist interpretation, again through the notion of prehension. One may note in this regard the phenomena that (a) "information transfer or influence" occurs between well-separated particles faster than light-cone propagation (that is, quantum entanglement) and (b) physical states are discernibly influenced by the selection and rapidity of an observation or measurement process (that is, quantum Zeno effect). It may well be no accident that one of the first philosopher-physicists to devise experimental tests for quantum entanglement phenomena was Abner Shimony, a student of Hartshorne's at the University of Chicago, who has remained indebted to the "Whiteheadian paradigm." In neuroscience, the emergence of neuroplastic phenomena, in which rigorously repeated thought or "attentional" exercises have an empirically discernible effect upon brain metabolism as shown through PET scans, also suggests a top-down causation model which again can be readily handled by a Hartshornean interpretation of particle-occasions as prehensive. Thus, Popper's dismissive estimate of the empirical explanatory power of psychicalist or panexperientialist concepts seems to be, at the very least, seriously challenged by more recent developments in physics.

Hartshorne believed that his concept of the infinite variability of mind-like qualities provides the theoretical bridge to extend the categories of experience beyond the human, the animal, or even the organic. He does not deny that these speculations about the possibility of radically non-human or non-animal minds are, for the foreseeable future, of little or no use to much of science. Physics, for example, need not worry whether atoms or electrons have "feelings"; but this may simply be a way of saying that what is of interest to metaphysics is not necessarily of interest to physics. In a 1934 article in The New Frontier, Hartshorne characterized physics as the behavioristic aspect of the lowest branch of comparative psychology, or even of comparative sociology, since reality, in his view, has a social structure. Hartshorne argued further that psychicalism is the metaphysic best suited to an evolutionary world-view. Psychicalism does not face the problem of the emergence of mind from what is wholly lacking in psychical qualities. Hartshorne calls the rival view, on which mind emerges from wholly mindless matter, "temporal dualism": all of the problems of mind-body dualism of how to relate nonphysical mind to nonmental matter are repeated, only in an evolutionary context. For Hartshorne, on the contrary, new forms of mind emerge in the process of evolution, but not mind itself.

c. Indeterminism and Freedom

The philosophy of creative becoming is inherently anti-deterministic. This is not to say that Hartshorne denied relations of cause and effect or that he rejected the laws of nature discovered through scientific investigation. It is all too common for philosophers to argue that the falsity of determinism implies chaoticism, the doctrine that there exists, at most, an appearance of causal regularity in the world. By way of clarification, Hartshorne noted that determinism posits absolute modal regularity in the sense that, for every set of causal conditions, it is not only the case that, then and there, there is only one effect that will occur (which may well be a truism), but there is only one effect, then and there, that can occur (note that "can" is a modal concept). As William James argued in "The Dilemma of Determinism," if some sets of causal conditions allow for more than one possible effect, then determinism is false. Therefore, the logical contradictory of absolute regularity is non-absolute regularity, not absolute irregularity (chaoticism). Absolute irregularity is the logical contrary, not the contradictory, of determinism. For this reason, Hartshorne argues in Wisdom as Moderation that determinism and chaoticism are the extreme metaphysical positions, both of which may be false. If both are false, then some form of indeterminism must be true.
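The point about contradictories and contraries can be put schematically (a formalization added for clarity; the symbolism is not Hartshorne's). Let R(e) say that the causal conditions of an event e leave exactly one outcome possible, then and there. Then:

Determinism (absolute regularity): for every event e, R(e).
Its contradictory (non-absolute regularity): for some event e, not R(e).
Chaoticism (absolute irregularity): for every event e, not R(e).

Denying determinism commits one only to the second claim, not the third; determinism and chaoticism, as contraries, are both false whenever some events satisfy R and others do not, which is just the situation Hartshorne's indeterminism describes.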

Determinism has sometimes gone by the name of the doctrine of necessity, as in Peirce's famous article "The Doctrine of Necessity Examined." The meaning of "necessity" as it applies to determinism is that a specific effect could not have been otherwise given the causes that brought it about; in other words, causes necessitate their effects. Indeed, determinists seek to minimize the extent to which events seem contingent—that they could have been otherwise—by uncovering their causal antecedents. The deterministic theory is that all contingency in the world, which is to say, all of the variety and novelty or all deviations from absolute regularity, is apparent only. Alternate effects seem possible, but determinists claim that this is only because of our ignorance of all of the factors—the causes and the laws that link cause to effect—that explain a particular effect. Nevertheless, hidden within the seeming contingency of our ignorance is another necessity: the causal nexus of events absolutely fixes the details of our knowledge in any given situation. Of course, whether determinism or indeterminism is correct, some degree of ignorance and fallibility is an inescapable aspect of the human condition. The indeterminism espoused by Hartshorne also admits of unknown causes that limit what is possible. For example, an athlete may eat breakfast with plans of competing later in the day, not realizing that the food she is eating is contaminated and will incapacitate her. In Hartshorne's theory, however, contingency is not merely a function of ignorance; on the contrary, sometimes there are real alternatives, no one of which the concatenation of causal conditions entirely eliminates. The incapacitated athlete, for example, may nevertheless have a variety of real alternatives for how to respond to the food poisoning.

Peirce argued, and Hartshorne agreed with him, that one cannot help but posit real alternatives: either reality as a whole could have been otherwise or contingency enters the world piecemeal or incrementally. Determinists may attempt to eliminate contingency within the universe by tracing events to their causal antecedents—to a singularity at the beginning of the universe or to an eternal decree from deity—but there remains the question of why the universe has the exact initial conditions that it has. There is no plausible modal theory that would allow one to consider the contingency of the initial conditions as a hidden form of necessity. Thus, contingency is unavoidable, or as Hartshorne says in Creative Synthesis (Ch. II), “There can be no alternative to alternativeness itself.” Hartshorne, following Peirce and James, locates the contingency of the universe not in an absolute beginning or in the divine will but within the universe’s own creative processes—in Hartshorne’s words, contingency “seeping into the world bit by bit” (Creativity in American Philosophy, Ch. X). James spoke of “pluralism’s additive world” and this is Hartshorne’s view: the coming-to-be of each dynamic singular introduces a morsel of novelty into existence and, in so doing, adds itself to the universe. Every subsequent dynamic singular must take account of this prior addition to the universe as a causal factor in its own emergence. In this way, there is a rhythm of the universe as each new subject of experience inevitably becomes a new object for a new experience.

It should now be clear that Hartshorne intended his version of indeterminism to leave ample room for the massive regularities—the order—of the world that scientists make it their business to discover, but these regularities are not absolute as determinists conceive them to be. Hartshorne turned on its head the traditional doctrine that effects are contained in their causes; for Hartshorne, it is the other way around: at the most basic metaphysical level of analysis, causes are contained in their effects. Again, Hartshorne finds a clue in the experience of memory. One's memory-of-X includes X as an indispensable causal component, but X as partial cause of one's memory-of-X does not contain the memory itself. Hartshorne goes further and denies that memory-of-X is contained, implicitly or virtually, in the entire set of causal conditions leading up to the memory. In short, the causal antecedents of the memory provide the necessary but not the sufficient conditions of the memory. The present memory-experience is an instance of creative experiencing; as such, it adds a novel element to reality. Nevertheless, the causal conditions are limiting factors in what experience may result from them; the causes define a field of possible experiential activity. Not just any effect can result from a particular set of causal conditions, and this principle is enough to block the inference from indeterminism to chaoticism. This principle also provides the metaphysical ground of developmental processes. For example, every adult human has a developmental history beginning with a fertilized egg, but no single-celled zygote and its genetic make-up is sufficient to make an adult. The countless intermediate steps of growth and education, as well as the person's own reactions to his or her circumstances, are required to complete the process.

Since at least David Hume, philosophers have acknowledged that empirical science cannot establish the truth of determinism. There remained, however, the idea that scientific explanations presuppose or require a deterministic framework. In Hartshorne's reckoning, Peirce disposed of this claim once and for all. First, Peirce observed that measurements can be no more fine-grained than our instruments and our proneness to error will allow. There can be no empirical or scientific meaning to the concept of an absolute measurement. Second, the far-reaching regularities in nature that a reasonable indeterminism posits are enough for the purposes of scientific theorizing; saying that the regularities are absolute, as determinism does, adds nothing. The much diminished levels of novel experiencing that Hartshorne's metaphysics locates in the world of inorganic beings make that realm as deterministic in appearance as it needs to be for the purposes of discovering laws of nature. To be sure, those laws must be understood as stochastic, but this fits well enough with scientific judgments, which are couched in terms of probabilities rather than certainties. It is worth noting that Hartshorne did not look to subatomic physics for his main support for indeterminism, for he believed that the case against determinism had already been made by Peirce and others. As far as Hartshorne was concerned, quantum indeterminacies buttress the case against determinism by showing that physics, the supposedly most materialistic of sciences, does not require determinism. Even Einstein, who rejected indeterministic interpretations of quantum phenomena, did not deny that those interpretations were scientific.

Numerous philosophers use moral freedom as an argument—perhaps the central argument—against determinism. Hartshorne agreed that moral freedom is indispensable to a proper understanding of human life but he was more interested in defending a more generalized idea of freedom that extends beyond moral decision-making and even into the nonhuman realm. Freedom in this more generalized sense, as a creative act, complements and completes Hartshorne’s indeterminism. In The Logic of Perfection (Ch. VIII), he speaks of causality as crystallized freedom and freedom as causality in the making. As we have just seen, for Hartshorne, every effect is more than, and even includes, the causal conditions that make it possible. If one analyzes the effect, abstracting from its causes, one is left with the particular way in which a dynamic singular experiences its causal antecedents, which is the measure of novelty in it. The word “experience” may call to mind a merely private epiphenomenon, but Hartshorne insists that experience has an ineliminable public aspect as it becomes a datum for subsequent experiences—a cause for future effects. In Creative Experiencing: A Philosophy of Freedom, he stresses that this idea of freedom is essentially social. Every creative act is a combination of self-determination and determination by others. The creative act, once completed in a dynamic singular, becomes part-cause of subsequent dynamic singulars. In this way, cause and effect relations are explained by the more basic principle of freedom limiting freedom.

For all of Hartshorne’s animadversions on determinism and his advocacy of a philosophy of creativity, he was under no illusions about the limits, sometimes extreme limits, on freedom in any particular situation. In The Logic of Perfection (Ch. VIII), he speaks of a present creation as adding “only its little mite” to the vast totality of the universe. Hartshorne says that a phrase like “creative experiencing” escapes redundancy because there are degrees of creativity. As Hartshorne’s indeterminism provides the metaphysical ground for developmental ideas, so the concept of freedom limiting freedom provides the ground for a meaningful concept of degrees of freedom. Freedom increases to the extent that there are more options of more complexity, allowing for greater contrasts of feeling. The development of more and more complex organisms during the course of evolution makes for new levels of organizational structure (for example, in the convolutions of brains), more varieties of experiencing, and a widening range of possibilities for creative realization. The most dramatic example of augmented freedom occurs when organisms cross the threshold from experience to conscious experience. This occurred in the evolution of the human species but it is also the natural development within each member of that species. Hartshorne remarks on how the complexities of a symphony can be appreciated by a human being but they are hopelessly beyond the understanding of creatures with simpler brains. Consciousness also makes possible moral freedom which brings with it increased opportunities for achievement and for risk of failure in the attainment of high ideals. The opportunities and the risks go hand-in-hand in such a way that one cannot be had without the other.

d. Personal Identity

The attribution of responsibility for acts worthy of praise or censure involves the concept of a person, or more fundamentally, of agency. With the problematic exception of a supernatural deity that exists outside of time, persons do not simply exist, they persist; their existence requires days, months, and years. Dynamic singulars, as momentary flashes of experience, are not persons, but in Hartshorne’s view, they are the raw materials from which persons are made. One can say that a person is a whole of which dynamic singulars are the parts. Hartshorne adopts the more refined categories of Whitehead’s philosophy in order to express, in neoclassical terms, the concept of personhood. Whitehead spoke of a nexus as any “particular fact of togetherness among actual entities.” A society is a type of nexus whose constituents prehend (feel) a common element of form. Every mammal, for example, is a society of dynamic singulars, each of which inherits from its predecessors and passes along to its successors the form of “mammal.” A society is more than a mere mathematical set, for the common form of the society is passed along—shared by prehensive relations—from one grouping of dynamic singulars to another.

In the philosophies of Whitehead and Hartshorne, the existence of a person requires that there be a special type of society, one that exhibits personal order. A personally ordered society is a sequence of dynamic singulars, no two of which are contemporaries. This is the neoclassical metaphysical account of our sense of being persons that persist through time. Both Whitehead and Hartshorne emphasize, however, that personally ordered societies are embodied. A personally ordered society is a sub-society within the larger society that is the human body. Leibniz spoke of a dominant entelechy or soul associated with each animal body, itself a collection of monads; a personally ordered society is a very rough equivalent of this (taking into account all of the caveats mentioned in the discussion of psychicalism). Each dynamic singular making up a personally ordered society inherits not only from its predecessor in the sequence but also from the dynamic singulars that make up the rest of the body. The body, one might say, is the immediate environment of the soul, or more colloquially, the self. Whitehead and Hartshorne believed that a personally ordered society does not survive without the body. Although neither philosopher definitively dismissed the possibility of a limited post-mortem existence, they did not show the slightest interest in speculating on the details of such a possibility.

Hartshorne argued that his and Whitehead's view of personhood avoids two extremes. A person is not, as Hume seemed to think, a mere bundle of qualities existing from moment to moment, with no internal relations among its component parts. Every dynamic singular within a personally ordered society is a creative appropriation of its predecessors in the sequence and in the wider environment of its body. As noted in the previous section, Hartshorne denied determinism without denying the efficacy of causal regularities. Certain kinds of damage to the brain, for instance, are real causal factors in seriously altering or even eliminating personality. The other extreme that Hartshorne claims to avoid is the denial of external relations among the components of the self. According to Leibniz, the identity of a monad, including a dominant entelechy, is in its "concept," which is all of the properties that ever were or will be true of it. Hartshorne maintains that a person is a product of developmental processes that are inherently open-ended, allowing for different outcomes. For this reason, Hartshorne accepted the Jamesian view that one's character as so far formed is no absolute guarantee of one's future behavior. It is true, as is said, that people "act in character," but one is also part-creator of one's character. We meet here once again, but now as applied to the problem of self-identity, the protean nature of creativity in neoclassical metaphysics. As each dynamic singular in one's personally ordered society emerges, one is a partly new self.

On Hartshorne’s principles, personal identity is not a matter of strict or mathematical identity. The additive nature of creativity entails that identity through time, or gen-identity, is relative only—a question of “more or less” rather than “all or none.” The unity of self-identity in a person is wholly a function of the inertia that past dynamic singulars carry into the present of a personally ordered series. Hartshorne sometimes spoke of this relation as being among past and present “selves.” James said that the present thought itself is the thinker. Hartshorne would agree, for it is not the past “selves” in a personal sequence that do the thinking; the present is where thinking occurs and where particular decisions are made. For most of us, most of the time, the broad outlines of our personality remain stable, allowing us to speak of ourselves as being “the same person.” Yet, dramatic changes are possible, for the better and for the worse. The annals of both brain science and of religious conversion are full of case histories of persons who undergo changes that are sufficiently global to speak of a new person being born. It is also worth noting that Hartshorne’s metaphysics allows for the possibility that a single body could support more than one personally ordered society; this might provide the outlines of an account of multiple personality or even of aspects of the unconscious mind.

Hartshorne’s theory of personal identity is not reductionist. It is, like his indeterminism and philosophy of freedom, inherently developmental. Consider the beginnings of a human life. In most cases, conception results in a full complement of chromosomes necessary for a human person to develop. Much more must be accomplished, even within the mother’s reproductive system, to complete the process. The single-celled zygote from which we grow is genetically human, but it is arguably not the individual we associate with being a person. For example, far from being one individual rather than another, the fertilized egg has the potential to divide to produce twins or triplets. Hartshorne noted that his twin brothers, James and Henry, were very different persons despite having the same genetic make-up. Another argument against reducing personhood to genetic structure is that the nervous system and a functioning brain, which provide the physiological basis of human personhood, are not present from the moment of conception; they are the result of development both in utero and after birth. These observations do not determine the moral or legal status of the unborn, but they are relevant to those questions, for they argue against reducing personhood to genetics. To be sure, the question of abortion is complicated. When does the unborn become a person with rights and how do these rights, assuming they exist, stand vis-à-vis a woman’s manifest right to self-determination? Hartshorne was firmly on the side of allowing women to decide for themselves, apart from interference from government or religion, whether to terminate an unwanted pregnancy. His position on abortion was basically that of Roe v. Wade. What Hartshorne’s metaphysic of personal identity brings to the debate is a robust rejection of reducing personhood to genetics and a corresponding emphasis on developmental categories. In The Second Sex, Simone de Beauvoir wrote that one is not born a woman, but becomes one. Hartshorne would agree and generalize the thought: one is not born a person, but becomes a person.

Hartshorne drew interesting ethical implications from his metaphysics of personal identity. Most notably, he argued that a metaphysics which includes such Whiteheadian notions as prehension, personally-ordered societies of actual occasions, and transmutation of conformal feeling, could never countenance what Hartshorne calls the “illusions of egoism.” Even more plausible versions of “enlightened” ethical egoism, which allow interest in others for the sake of welfare of self, are incoherent in Hartshorne’s reckoning. Enlightened self-interest theories are based on a partially true but misleading “common sense” conception of self-identity that fails to grasp the logical distinction between being an individual and being the concrete states of an individual. The former is an abstraction from the latter. No momentary state is strictly identical with any other but there can be enough continuity to speak of an abstract, relatively unchanging, character. As Hartshorne says in The Zero Fallacy (Ch. XII), “The identity is somewhat abstract, the non-identity is concrete. Without this distinction the language of self-identity is a conceptual trap.” When this distinction is grasped, we see that the claim to have an interest in self cannot be simpliciter or absolute, since there must always be an “other,” namely, the future concrete states of the individual self, to which the interests of the self in a concrete state now must be addressed. Moreover, the fact that (psychologically normal) individuals “enjoy the enjoyment of others” is grounded in the metaphysical structure of social selves, whose dominant occasions of experience are built up and transmuted by conformal feeling of the feeling-tone in constituent neural occasions. We are, quite literally in Hartshorne’s account, “members one of another.” That is to say, a “self” is precisely a creative synthesis of feelings of others through its “perceptual mode of causal efficacy” in Whitehead’s language. The capacity for feeling the feelings of others, in a word “sympathy,” is basic, and thus the capacity for altruism as well as selfishness is built into the nature of being a social organism.

e. Time and Possibility

Hartshorne’s philosophy of creative experiencing is inseparable from his philosophy of time. As already explained, he posits a universe that is forever in the making by the dynamic singulars that come to be. What has already been made is the past, what has yet to be made is the future, and the present is the locus of activity where future possibility becomes past actuality. This characterization of time is in one sense circular, for the definiens presupposes the definiendum; for example, “yet to be” presupposes “future.” What keeps the definition from being vacuous is Hartshorne’s concept of creativity or making. Classical ideas about creation in the Christian tradition, for example, place God outside of time as its creator. According to this theory, God brings the temporal world—past, present, future—into existence but the divine act itself is not in time. From God’s eternity, what is future for us is as fully detailed as any moment that has for us become past. Hartshorne, on the other hand, finds a fundamental asymmetry in temporal relations. There is no such thing, even from a divine perspective, as a future that is as fully detailed as the past. The future, as “yet to be made,” lacks details that will not exist until the making of them. The “making of them,” as already noted, adds something to the universe that was not previously part of it. The universe, and time itself, is nothing more than this process of accumulated and accumulating acts of becoming and all that they contain.

Some commentators are tempted to see in Hartshorne’s theory of time a variation on J. M. E. McTaggart’s concept of an A-series. However, in his article on “Time” for Vergilius Ferm’s 1945 Encyclopedia of Religion, Hartshorne distinguished his own ideas from those of McTaggart. McTaggart distinguished two ways of marking time: the A-series of relations of past, present, and future and the B-series of relations of before and after. McTaggart said that if one abstracts from the A-series and B-series relations, one is left with an ordered array of events, called a C-series, without temporal order of any kind. If a C-series is like a film strip, with each frame representing an event, the A-series is analogous to frames passing in front of the light of the projector; as the light shines through a particular frame, the photo on that frame is a present event; beforehand it is a future event, and afterwards it is a past event. By contrast, Hartshorne’s cumulative theory of becoming entails that there is no such thing as a C-series from which A-series relations could be abstracted. To continue the analogy, there is no film running on a projector with frames yet to be viewed. In short, there are no future events. At best, and in keeping with Hartshorne’s indeterminism, there is a field of possibility that is only as detailed as the past determines it must be, all else in the field remaining essentially vague, awaiting full determination as novel dynamic singulars arise in the creative advance. By parity of reasoning, B-series relations are not fixed in eternity but are themselves results of temporal becoming. For example, the fact denoted by “Socrates died before Aristotle’s birth” could not exist until Aristotle was born. This is no mere limitation of human knowledge. After Socrates’ death and before Aristotle’s birth, there was no such relation as Socrates-having-died-before-Aristotle-was-born; what existed at the time of Socrates’ death was a range of recently emergent possibilities of someone or other being born after Socrates, for example: a-great-philosopher-born-fifteen-years-after-the-death-of-Socrates. As Hartshorne says in the Encyclopedia article, “Time is not a mere relation of becomings but a becoming of relations.”

Hartshorne grounded modal concepts in the temporal structure of the world. He often quoted, with approval, Peirce’s dictum that time is a particular variety of objective modality: the past is fully determinate or actual, the future is relatively indeterminate or possible, and the present is the becoming of the actual as the relatively indeterminate becomes determinate. Following the lead of both Peirce and James, Hartshorne argued that determinism denies the reality of time. As noted previously, the only objective modality where determinism is concerned is necessity. Hartshorne’s indeterminism, on the other hand, posits necessity in the direction from effect to cause; in the direction from cause to effect, however, there is an element of contingency, and this is the objective modality of the future. Determinists emphasize our ignorance of causes and the consequent inability to clearly perceive the necessary relations among all events. For the determinist, however, the ignorance includes the systematic illusion of time’s direction. From a practical point of view we cannot help but treat the illusion as reality. Aristotle remarked in the Nicomachean Ethics (Bk. VI) that no one decides to have sacked Troy; however, the war (assuming its historicity) was once a matter of urgent decision in which the future was not something to be known but something to be made. For this reason, Hartshorne maintained that we act as though the future is relatively indeterminate even if we convince ourselves otherwise.

Hartshorne argued that the human capacity to form general conceptions and to frame principles that guide actions is another illustration that we necessarily act as though determinism were false and time were real. The asymmetry between remembering a past event and planning for a future event is a powerful indication of the asymmetry of time. One may remember or misremember any amount of detail about a plan that has been carried out, but when the plan has yet to be executed, the only details that can be known are ones within the plan itself. As a script for future activity, the plan is abstract compared to its eventual realization. One may remember having taken one’s dog for a walk, including the memory of having intended to take a particular route; however, the memory of the originally intended route cannot include everything that happened on the walk: on this walk, a toddler peered at you from beside a car, a fallen branch blocked your path, you stepped on two ants, a street lamp burned out, a raccoon scurried into a sewage drain—these and countless other details were not included in the plan. Of course, what one anticipates by way of plans, intentions, or purposes can be more or less specific. Regardless of the amount of detail, however, one’s future projects leave innumerable particulars undecided. When things go “as planned,” it is not because every aspect of the plan matched some detail fixed in advance, for there are many ways that plans can be successfully fulfilled. Musicians know that every musical score leaves a great deal to be decided; different performances can be equally faithful to what the composer wrote.

Hartshorne realized that if his theory of modality as essentially temporal is correct then there can be no such thing as merely possible worlds that are not anchored in the actual world. At most, there are possible world-states; that is to say, there are ways the actual world might have been. For any given past event, there was a time when something else might have occurred in its place. We can ask “What if?” about the past in order to conceive of ways the world might have been different, even though nothing can now be done to change what occurred. The future, on the other hand, is the arena of what might yet occur given the actual history of the world up to the present. Hartshorne’s view contrasts neatly with Leibniz’s idea that possible worlds are completely detailed descriptions of universes that God might choose to create. Possible worlds, in the Leibnizian sense, contain possible persons. As Leibniz argues in his correspondence with Arnauld, when the “concept”—the complete description of a possible person—is made actual by God, the person exists; the making actual of a different “concept” (that is, altering the description in some way) would result in a different person. Hartshorne objects that persons cannot be merely possible. Contrary to Leibniz, an actual person could have had properties other than what it has and the properties that it has could have been had by others. For example, Hillary Clinton could have been elected the U.S. president in 2008 and someone besides Hillary could have been Bill Clinton’s first lady. A fictional character, on the other hand, has no reality beyond the description of it; it has enough specificity to simulate a real person, but no feat of magic could transform it into a real person. Hartshorne’s arguments clearly anticipate and dovetail with those of Saul Kripke in Naming and Necessity. Kripke maintains that a proper name designates the same object across possible worlds (for example, Hillary Clinton) whereas a description designates different objects from world to world (for example, “winner of the 2008 U.S. presidential election”). Kripke also suggested that “counter-factual situation” or “possible state (or history) of the world” are less misleading expressions than “possible world.” To speak of a “counter-factual” is to presuppose the factual. On these points, Hartshorne and Kripke are in full agreement.

On the question of the nature of possibility, Hartshorne sided closely with Peirce but parted ways with Whitehead. Peirce conceived the realm of possibilities as a continuum which, by definition, has no least member, but is infinitely divisible. There are no actual parts of a continuum, only an infinite number of ways to slice it. This idea is evident in Hartshorne’s concept of the affective continuum (see the companion article “Charles Hartshorne: Biography/Philosophy: Philosophy and Psychology of Sensation”). Whitehead, on the other hand, spoke of “eternal objects” as “forms of definiteness” that identify what a thing is. The point of calling eternal objects “eternal” is that none of them are novel; the point of calling them “objects” is that they are definite; for example, a particular shade of green is this shade and no other. To use Whitehead’s example, a leaf on a tree changes colors but any particular shade of color exhibited by the leaf does not change. Hartshorne maintains, by contrast, that the shades of color in question are neither eternal nor are they definite objects; put somewhat differently, they are definite only insofar as they are not eternal. The successive shades of color of the leaf are slices of the color continuum that exist as definite only when instantiated in the leaf. The color of the leaf at a particular moment is novel. In Hartshorne’s account, we speak of sameness of color because the gradation between any two shades may be so infinitesimally slight as to be imperceptible. He noted that observed sameness of color is not a transitive relation. An object X may appear to be the same color as Y and Y the same color as Z, but X may appear slightly different than Z. In other words, there is a threshold defined by a degree of separation on the color continuum below which real differences are not observable for creatures like us.
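Hartshorne’s point about the failure of transitivity can be put schematically. The following sketch is an illustration only, not Hartshorne’s own notation: the positive threshold \(\varepsilon\) simply stands in for the degree of separation on the continuum below which differences escape observation.

\[
x \sim y \;\iff\; |x - y| \le \varepsilon .
\]
\[
\text{Let } x = 0,\; y = \varepsilon,\; z = 2\varepsilon:\quad
|x - y| = \varepsilon \le \varepsilon,\quad
|y - z| = \varepsilon \le \varepsilon,\quad
\text{but } |x - z| = 2\varepsilon > \varepsilon .
\]

On this toy model, X matches Y and Y matches Z while X fails to match Z, which is just the pattern Hartshorne describes: observed sameness tolerates real but sub-threshold differences, and such differences can accumulate.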

According to Hartshorne, any quality that admits of a negative instance is not eternal. There are, in short, emergent universals. In Creative Synthesis (Ch. IV), Hartshorne notes that “lover of Shakespeare” is a universal in the sense that it may be true of more than one thing but it is emergent in the sense that it could be true of nothing prior to Shakespeare. By parity of reasoning, specific qualities in the affective continuum—particular tonal qualities, particular shades of color, and so forth—emerge as the affective continuum is cut in various ways and patterns by dynamic singulars. On the other hand, the generic quality of “feeling” may be classified, in Hartshornean principles, as eternal, if not quite an “object” in the Whiteheadian sense. As previously noted, qualities that admit only of positive instances are metaphysical. A consequence of Hartshorne’s view is that similarity is not simply a function of partial identity. It is true that we count two things similar when they have a sufficient number of qualities in common. But it is also the case that qualities themselves are similar to each other, as when we observe that orange is closer to red than it is to blue. Hartshorne concludes that similarity is as metaphysically ultimate as identity.

f. The Aesthetic Motif

One of the best and earliest interpreters of Hartshorne’s philosophy, Eugene Peters, spoke of “the aesthetic motif” that runs through neoclassical theism. Peters was drawing attention to the fact that, for Hartshorne, the most inclusive values are aesthetic. Hartshorne began his career proposing, as an empirical hypothesis, that all sensations are feelings and that all feelings exist along an aesthetic continuum (see “Charles Hartshorne: Philosophy and Psychology of Sensation”). Hartshorne’s metaphysics completes and complements the empirical hypothesis by considering the value-achievement and value-enrichment of dynamic singulars as the very foundations of existence. In Divine Beauty, Daniel Dombrowski rightly says that, for Hartshorne, aesthetic experiences are not merely woven into the real, they are the real. The poet e. e. cummings wrote, “Since feeling is first / who pays any attention / to the syntax of things . . .” Hartshorne did precisely what cummings dismissed (at least in the poem): he recognized feeling as first (that is, as a metaphysical category) but he also paid close attention to the syntax of things (to understand the structure of feeling).

In his first book Hartshorne rejected the “annex view of value.” In the context of neoclassical metaphysics this means that there is no merely valueless stuff (what Whitehead called “vacuous actuality”) onto which values are projected by human or divine purposes. Our pre-reflective experiences of our bodies, of our memories, and of the world are never, Hartshorne insists, of bare valueless existence. The mother hears her baby’s cries as irritating, and the child hears the mother’s songs as soothing. The values in experience, however, are not primarily ethical but aesthetic, a fact most clearly illustrated in the animal kingdom. The experiences of subhuman creatures are productive primarily—and for most creatures, exclusively—of non-moral values. When a lion fells an antelope, it is good for the lion pride and bad for the antelope, but moral judgments are out of place. One may, it is true, stress what is adaptive in behavior and useful for the survival of the species. There remain, however, the values of living for the lions and for the antelope that derive from being aware of the world around them, of breathing, eating, and interacting with their fellows. These creatures do not think about their worlds but they feel them. For them, aesthesis or feeling (the root of “aesthetics”) is indeed “first.” Hartshorne’s extensive study of songbirds in his book Born to Sing supports this hypothesis; oscines have what in us would be called an aesthetic sense.

Hartshorne did not consider beauty to be the only aesthetic value, but “beauty” was his word of choice for what anchors his aesthetic theory. One could generalize or gloss “beauty” to mean intense satisfactory experience without distorting Hartshorne’s meaning. Much of traditional aesthetics holds that beauty is unity within diversity. Hartshorne argued, however, that another contrast is necessary to make sense of beauty, that of complexity and simplicity. This concept of beauty, along with the relation of beauty to other aesthetic values, is expressed in the Dessoir-Davis-Hartshorne Circle. (Max Dessoir and Kay Davis helped Hartshorne with the diagram.) If Hartshorne is correct, then beauty is a mean between two extremes, between order and disorder on the one hand (the vertical axis of the circles) and between complexity and simplicity on the other (the horizontal axis of the circles). Outside of the boundary of the outer circle is not merely aesthetic failure but also the failure of experience and therefore (because of Hartshorne’s psychicalism) of existence itself.

[Figure: graph of undiversified unity]

For Hartshorne, beauty (or any aesthetic quality) is not merely in “the eye of the beholder” (or the perception of the perceiver). One must take into account not only the perceiving mind but what the mind perceives. A mind of sufficient complexity, cultivation, and education is required to appreciate the elements that make for beauty in something. For example, until one knows what counterpoint is and until one is taught to listen for it, one is not in a position to be fully aware of it and one may not even be able to hear it. An adequate grasp of such things is beyond the ability of creatures with simpler nervous systems or of humans with certain kinds of brain damage. There is, in short, an intellectual component of beauty that requires a higher intellect to appreciate. This intellectual side of beauty predominates in science and mathematics, but Hartshorne argues that the twin contrasts of order/disorder and complexity/simplicity remain. In one of his articles, titled “Science as the Search for the Hidden Beauty of the World” (1982), he chronicles the ways in which ideals of beauty guide pure scientific inquiry and how the deliverances of science themselves are beautiful. Science seeks a proper balance between imagination (for example, theorizing) and observation. Hartshorne speaks of the “romance of science” as “the disclosure of a universe whose wild harmonies surpass the most vivid dreams of imagination not submitting itself to criticism and observational test.” He reminds us that Darwin closed the Origin with “a prose poem on the beauty of the web of life.”

Prediction is one of the goals of scientific inquiry, but even here, there is an aesthetic component. Too little predictability is chaotic but too much predictability is monotonous. Good science is also heuristic, meaning that it is fruitful, leading to more discoveries. But discoveries in the strict sense are not predictable and are often quite surprising. Hartshorne accuses the determinism of traditional Newtonian science of aesthetic failure for it posited absolute regularity as the ideal to the exclusion of spontaneity, chance, and freedom: the adventure of life reduced to mechanistic obedience to law. Hartshorne’s indeterminism, as we have seen, respects the rule of laws of nature but provides a balance between regularity and irregularity. Traditional theology, Hartshorne claims, was as defective from an aesthetic point of view as the traditional philosophy of nature. Classical theologians stressed divine simplicity and unity to the exclusion of complexity and variety. In an article titled “The Aesthetic Dimensions of Religious Experience” (1992) Hartshorne says, “The beauty of the world is in its partly unprogrammed spontaneities.” Hartshorne’s neoclassical theism affirms a world of multiple creative agents in interaction with each other and with God (see “Charles Hartshorne: Dipolar Theism”). In Hartshorne’s view, God is affected by the creatures and, consequently, the divine experience is a complex reality, full of all of the serendipity and tragedy that interactions with others routinely bring. If Hartshorne is correct, there is an ever changing beauty of the world as a whole that is fully appreciated only by deity and to contemplate this divine experience is to have something akin to what classical theologians called the beatific vision (see the discussion of Hartshorne’s aesthetic argument in “Charles Hartshorne: Theistic and Anti-theistic Arguments”).

4. Conclusion: Hartshorne’s Legacy

At an early age, after reading Emerson, Hartshorne says in his introduction to The Logic of Perfection that he resolved “to trust reason to the end.” He left ample evidence that he was true to this purpose. He was, however, sensitive to the many ways in which philosophy is a frail and fallible enterprise. Communication must take place across centuries and across cultural and linguistic boundaries. There is the snobbery and inertia of traditions and what Hartshorne called “cultural lag” in the recognition of genuine insights (“Analysis and Cultural Lag in Philosophy,” 1973). There is the tendency to forget, ignore, or marginalize objections to one’s views; Hartshorne also considered it mistaken to suppose that meeting objections is sufficient for securing the rationality of one’s ideas, or as he wrote in his correspondence with Edgar Sheffield Brightman, to merely defend one’s own “castle of ideas.” As Carnap said, it is one thing to ask what your metaphysical position commits you to, but it is something else again to ask what commits you to your metaphysical position. Despite their knowledge of formal logic, philosophers are also susceptible to the fallacy of affirming the consequent, looking only for confirmation of their views or for arguments favorable to them. There is, finally, the failure to exhaust the logically possible alternatives in considering the solutions for particular philosophical problems. Hartshorne discussed all of these obstacles, and more, to making progress in philosophy, and he took measures to remedy them in his own attempt to trust reason.

Hartshorne distinguished, with Edith Wharton, between those who light new candles and those who are mirrors reflecting the candles that are lit by others. At the close of his autobiography, he remarked that Whitehead and Peirce had done both, and he dared to hope that he had done the same. Hartshorne’s own “candle” has perhaps often been missed because he expended a great deal of energy reflecting the lights of Whitehead and Peirce. Hartshorne, however, was neither a Whiteheadian nor a Peircean. This is true not only of his range of interests and expertise (he contributed to the psychology of sensation and to the study of bird song) but also of his systematic presentation, development, and defense of the project of metaphysics, as well as of his own distinctive metaphysical system. He lacked neither ideas nor arguments to support them. His neoclassical metaphysics is arguably one of the great intellectual achievements of the twentieth century.

5. References and Further Reading

a. Primary Sources: Books (In Order of Appearance)

  • Hartshorne, Charles. 1937. Beyond Humanism: Essays in the Philosophy of Nature. Chicago: Willett, Clark & Company. Republished in 1975 by Peter Smith.
  • Hartshorne, Charles. 1953. Reality as Social Process: Studies in Metaphysics and Religion. Boston: Beacon Press.
  • Hartshorne, Charles. 1962. The Logic of Perfection and Other Essays in Neoclassical Metaphysics. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1970. Creative Synthesis and Philosophic Method. La Salle, Illinois: Open Court.
  • Hartshorne, Charles. 1972. Whitehead’s Philosophy: Selected Essays, 1935-1970. Lincoln, Nebraska: University of Nebraska Press.
  • Hartshorne, Charles. 1976. Aquinas to Whitehead: Seven Centuries of Metaphysics of Religion. Milwaukee, Wisconsin: Marquette University Publications.
  • Hartshorne, Charles. 1983. Insights and Oversights of Great Thinkers: An Evaluation of Western Philosophy. Albany: State University of New York Press.
  • Hartshorne, Charles. 1984. Creativity in American Philosophy. Albany: State University of New York Press.
  • Hartshorne, Charles. 1987. Wisdom as Moderation: A Philosophy of the Middle Way. Albany: State University of New York Press.
  • Hartshorne, Charles. 1997. The Zero Fallacy and Other Essays in Neoclassical Philosophy, edited by Mohammad Valady. Peru, Illinois: Open Court Publishing Company.
  • Hartshorne, Charles. 2011. Creative Experiencing: A Philosophy of Freedom, edited by Donald W. Viney and Jincheol O. Albany: State University of New York Press.
  • Auxier, Randall E. and Mark Y. A. Davies, editors. 2001. Hartshorne and Brightman on God, Process, and Persons: The Correspondence, 1922-1945. Nashville: Vanderbilt University Press.
  • Viney, Donald W., guest editor. 2001. Process Studies, Special Focus on Charles Hartshorne, 30/2 (Fall-Winter).
  • Viney, Donald W., editor. 2001. Charles Hartshorne’s Letters to a Young Philosopher: 1979-1995. Logos-Sophia, the Journal of the Pittsburg State University Philosophical Society, volume 11. Pittsburg, Kansas.
  • Viney, Donald W., guest editor. 2011. Process Studies, Special Focus Section: Charles Hartshorne, 40/1 (Spring/Summer): 91–161.
  • Vetter, Herbert F., editor. 2007. Hartshorne, A New World View: Essays by Charles Hartshorne. Cambridge, Massachusetts: Harvard Square Library.

b. Primary Sources: Hartshorne’s Response to his Critics

  • “Interrogations of Charles Hartshorne,” conducted by William Alston. 1964. Philosophical Interrogations, edited by Sydney and Beatrice Rome. New York: Holt, Rinehart and Winston: 319–354.
  • Cobb, John B. Jr. and Franklin I. Gamwell, editors. 1984. Existence and Actuality: Conversations with Charles Hartshorne. Chicago: University of Chicago Press.
  • Hahn, Lewis Edwin, editor. 1991. The Philosophy of Charles Hartshorne, The Library of Living Philosophers Volume XX. La Salle, Illinois: Open Court.
  • Kane, Robert and Stephen H. Phillips, editors. 1989. Hartshorne, Process Philosophy and Theology. Albany: State University of New York Press.
  • Sia, Santiago, editor. 1990. Charles Hartshorne’s Concept of God: Philosophical and Theological Responses. Dordrecht, the Netherlands: Kluwer Academic Publishers.

c. Primary Sources: Selected Articles

  • Hartshorne, Charles. 1932. “Contingency and the New Era in Metaphysics, I.” Journal of Philosophy 29/16. 4 August: 421–431.
  • Hartshorne, Charles. 1932. “Contingency and the New Era in Metaphysics, II.” Journal of Philosophy 29/17. 18 August: 457–469.
  • Hartshorne, Charles. 1934. “The New Metaphysics and Current Problems, I.” New Frontier 1/1: 24–31; “The New Metaphysics and Current Problems, II.” New Frontier 1/5: 8–14.
  • Hartshorne, Charles. 1935. “Metaphysics for Positivists.” Philosophy of Science 2/3. July: 287–303.
  • Hartshorne, Charles. 1945. “Time.” An Encyclopedia of Religion, edited by Vergilius Ferm. New York: Philosophical Library: 787–788.
  • Hartshorne, Charles. 1964. “Thinking About Thinking Machines,” Texas Quarterly 7/1. Spring: 131–140.
  • Hartshorne, Charles. 1970. “The Development of My Philosophy” in John E. Smith (ed.) Contemporary American Philosophy: Second Series, London: Allen & Unwin: 211–28.
  • Hartshorne, Charles. 1973. “Analysis and Cultural Lag in Philosophy.” Southern Journal of Philosophy 11/2-3: 105–112.
  • Hartshorne, Charles. 1977. “Bell’s Theorem and Stapp’s Revised View of Space-Time.” Process Studies 7/3 (Fall): 183–191.
  • Hartshorne, Charles. 1978. “A Philosophy of Death.” Philosophical Aspects of Thanatology, volume 2, edited by Florence M. Hetzler and A. H. Kutscher. New York: MSS Information Corporation: 81–89.
  • Hartshorne, Charles. 1980. “Mysticism and Rationalistic Metaphysics.” Understanding Mysticism, edited by Richard Woods. Garden City, New York: Image: 415–421.
  • Hartshorne, Charles. 1981. “Concerning Abortion: An Attempt at a Rational View.” The Christian Century 98/2. 21 January: 42–45.
  • Hartshorne, Charles. 1982. “Science as the Search for the Hidden Beauty of the World.” The Aesthetic Dimension of Science, 1980 Nobel Conference, Number 16, edited by Deane W. Curtin. New York: Philosophical Library: 85–106.
  • Hartshorne, Charles. 1987. “Mind and Body: A Special Case of Mind and Mind.” A Process Theory of Medicine: Interdisciplinary Essays, edited by Marcus Ford. Lewiston, New York: Edwin Mellen Press: 77–88.
  • Hartshorne, Charles. 1987. “A Metaphysics of Universal Freedom.” Faith and Creativity, Essays in Honor of Eugene H. Peters, edited by George Nordgulen and George W. Shields. St. Louis, Missouri: CBP Press: 27–40.
  • Hartshorne, Charles. 1988. “Some Principles of Procedure in Metaphysics.” The Nature of Metaphysical Knowledge, edited by G. F. McLean and Hugo Meynell. Lanham, New York: University Press of America: 69–75.
  • Hartshorne, Charles. 1988. “Sankara, Nagarjuna, and Fa Tsang, with Some Western Analogues.” Interpreting Across Boundaries: New Essays in Comparative Philosophy, edited by G. J. Larson and Eliot Deutsch. Princeton, New Jersey: Princeton University Press: 98–115.
  • Hartshorne, Charles. 1989. “Von Wright and Hume’s Axiom.” The Philosophy of Georg Henrik von Wright, edited by Paul Arthur Schilpp and Lewis Edwin Hahn. La Salle, Illinois: Open Court: 59–76.
  • Hartshorne, Charles. 1990. “Hegel, Logic, and Metaphysics,” CLIO 19/4: 345–352.
  • Hartshorne, Charles. 1991. “An Open Letter to Carl Sagan.” The Journal of Speculative Philosophy 5/4: 227–232.
  • Hartshorne, Charles. 1992. “The Aesthetic Dimensions of Religious Experience.” Logic, God and Metaphysics, ed. J. F. Harris. Dordrecht: Kluwer Academic Publishers: 9–18.
  • Hartshorne, Charles. 1993. “Can Philosophers Cooperate Intellectually: Metaphysics as Applied Mathematics.” The Midwest Quarterly 35/1. Autumn: 8–20.
  • Hartshorne, Charles. 1994. “Three Important Scientists on Mind, Matter, and the Metaphysics of Religion.” The Journal of Speculative Philosophy, 8/3: 211–227.

d. Secondary Sources

  • Chancey, Anita. 1999. “Rationality, Contributionism, and the Value of Love: Hartshorne on Abortion.” Process Studies 28/1-2. Spring-Summer: 85–97.
  • Dombrowski, Daniel A. 1988. Hartshorne and the Metaphysics of Animal Rights. Albany: State University of New York Press.
  • Dombrowski, Daniel A. 2004. Divine Beauty: The Aesthetics of Charles Hartshorne. Nashville, Tennessee: Vanderbilt University Press.
  • Easterbrook, Gregg. 1998. “A Hundred Years of Thinking About God, A Philosopher Soon to be Rediscovered,” U.S. News & World Report. February 23: 61, 65.
  • Fitzgerald, Paul. 1972. “Relativity Physics and the God of Process Philosophy.” Process Studies 2/4. Winter: 251–276.
  • Ford, Lewis S. 1968. “Is Process Theism Compatible with Relativity Theory?” Journal of Religion 48/2. April: 124–135.
  • Ford, Lewis S., editor. 1973. Two Process Philosophers: Hartshorne’s Encounter with Whitehead. Tallahassee, Florida: American Academy of Religion.
  • Griffin, David Ray, John B. Cobb Jr., Marcus P. Ford, Pete A. Y. Gunter, and Peter Ochs. 1993. Founders of Constructive Postmodern Philosophy: Peirce, James, Bergson, Whitehead, and Hartshorne. Albany: State University of New York Press.
  • Jesse, Jennifer G. and J. Wesley Robbins, editors. 2001. American Journal of Theology & Philosophy, memorial issue in tribute to Charles Hartshorne, 22/2. May.
  • Minor, William S., editor. 1969. Charles Hartshorne and Henry Nelson Wieman. Lanham, MD: University Press of America.
  • Myers, William, guest editor. 1998. The Personalist Forum, Special Issue on Charles Hartshorne, 14/2. Fall.
  • Peters, Eugene H. 1970. Hartshorne and Neoclassical Metaphysics. Lincoln: University of Nebraska Press.
  • Peters, Eugene H. 1976. “Philosophic Insights of Charles Hartshorne,” Southwestern Journal of Philosophy, VII, 1/17: 157–170.
  • Ramal, Randy, editor. 2010. Metaphysics, Analysis, and the Grammar of God: Process and Analytic Voices in Dialogue. Tübingen, Germany: Mohr Siebeck.
  • Reck, Andrew J. 1961. “The Philosophy of Charles Hartshorne,” Tulane Studies in Philosophy X. May: 89–108.
  • Reese, William L. and Eugene Freeman, editors. 1964. Process and Divinity: Philosophical Essays Presented to Charles Hartshorne: The Hartshorne Festschrift. La Salle, Illinois: Open Court Publishing Company.
  • Shields, George W. 1992. “Infinitesimals and Hartshorne’s Set-Theoretic Platonism.” The Modern Schoolman 49/2. January.
  • Shields, George W. 2004. “Process and Universals” in After Whitehead: Rescher on Process Metaphysics, ed. by M. Weber. Frankfurt: Ontos Verlag.
  • Shields, George W. 2008. “‘Beyond Enlightened Self-Interest’ Revisited: Process Philosophy and the Biology of Altruism” in Researching with Whitehead: Essays in Honor of John B. Cobb, Jr., ed. by F. Riffert and Hans-Joachim Sander. Muenchen: Verlag Karl Alber.
  • Shields, George W. 2008. “MWI Quantum Theory: Some Logical and Philosophical Issues,” paper presented at the Center for Philosophy and Natural Sciences, California State University-Sacramento.
  • Shields, George W. 2009. “Quo Vadis?: On Current Prospects for Process Philosophy and Theology,” The American Journal of Theology & Philosophy, 30/2. May.
  • Shields, George W. 2010. “Eternal Objects, Middle Knowledge, and Hartshorne: A Response to Malone-France,” Process Studies, 39/1. Spring/Summer: 149–165.
  • Shields, George W. 2010. “Panexperientialism, Quantum Theory, and Neuroplasticity” in Process Approaches to Consciousness, eds. Michel Weber and A. Weekes. Albany: State University of New York Press.
  • Shields, George W., editor. 2003. Process and Analysis: Whitehead, Hartshorne, and the Analytic Tradition. Albany: State University of New York Press.
  • Simoni-Wastila, Henry. 1999. “Is Divine Relativity Possible? Charles Hartshorne on God’s Sympathy with the World.” Process Studies 28/1-2. Spring-Summer: 98–116.
  • Sprigge, T. L. S. 2006. The God of Metaphysics. Oxford: Clarendon Press.
  • Suchocki, Marjorie Hewitt and John B. Cobb, Jr. editors. 1992. Process Studies, Special Issue on the Philosophy of Charles Hartshorne, 21/2. Summer.
  • Viney, Donald Wayne. 2008. “Charles Hartshorne (1897-2000),” Handbook of Whiteheadian Process Thought, Volume 2, edited by Michel Weber and Will Desmond. Frankfurt / Paris / Lancaster: Ontos Verlag: 589–596.
  • Viney, Donald Wayne and Rebecca Viney. 1993. “For the Beauty of the Earth: A Hartshornean Ecological Aesthetic.” Proceedings of the Institute for Liberal Studies: Science, Technology & Religious Ideas, volume 4. Frankfort: Kentucky State University: 38–44.
  • Whitehead, Alfred North. 1978 [1929]. Process and Reality: An Essay in Cosmology, corrected edition, edited by David Ray Griffin and Donald W. Sherburne. New York: Free Press.
  • Wilcox, John T. 1961. “A Question from Physics for Certain Theists.” Journal of Religion 40/4. October: 293–300.

e. Bibliography

“Primary Bibliography of Philosophical Works of Charles Hartshorne” (compiled by Dorothy Hartshorne; corrected, revised, and updated by Donald Wayne Viney and Randy Ramal) in Herbert F. Vetter, editor, Hartshorne: A New World View: Essays by Charles Hartshorne. Cambridge, Massachusetts: Harvard Square Library, 2007: 129–160. Also published in Santiago Sia, Religion, Reason and God. Frankfurt am Main: Peter Lang, 2004: 195–223.

Author Information

Donald Wayne Viney
Email: don_viney@yahoo.com
Pittsburg State University
U. S. A.

and

George W. Shields
Email: George.shields@kysu.edu
Kentucky State University
U. S. A.

Political Constructivism

Political Constructivism is a method for producing and defending principles of justice and legitimacy. It is most closely associated with John Rawls’ technique of subjecting our deliberations about justice to certain hypothetical constraints. Rawls argued that if all of us reason in the light of these conditions we could arrive at the same judgment about justice. Moreover, our shared judgment about justice is justified precisely because it resulted from a suitably structured deliberative process. This is constructivism’s key idea; it holds that certain complex entities are constructed from more fundamental elements.

In moral and political constructivism, the complex entities are moral and political principles or obligations, such as the principle of “to each according to his merits” or the obligations created through contracts. The debates surrounding constructivism tend to concern the nature of these elements and the process by which they get assembled. Some constructivists are more subjective insofar as they cast these elements as attitudes and values of living agents or as the settled political values of a particular society. Others are more objective insofar as they identify these elements with universal precepts of practical reason working in combination with abstract conceptions of persons and society. In each case, the constructivist holds the view that these elements—no matter how they are specified—are brought together in a set of reasons favoring one principle over another. The process by which this happens is a process of construction, since the human mind actively assembles the considerations from which a principle is formulated; it does not passively receive its formulation. Absent this active mental process, there are no criteria for guiding political action or justifying our political institutions, nor any way to properly assess our genuine political obligations. In order to perform these evaluative tasks, we must construct the metric of assessment. Political constructivism is a philosophical account of how this constructing happens, and how the process confers moral authority on the resulting principles.

Table of Contents

  1. Introduction
  2. A Brief History
  3. Political Constructivism: Two Formulations
  4. Political Constructivism and Procedures
  5. Political Constructivism and Social Problems
  6. Conclusion
  7. References and Further Reading

1. Introduction

The term “constructivism” is still relatively new to political and moral theory. It emerged sometime in the second half of the twentieth century to describe John Rawls’ general approach to normative political theory. Since first appearing, it has developed into a family of positions in normative ethics, political philosophy, and metaethics. The term “political constructivism” is newer still and sometimes used to describe the approach Rawls employed in Political Liberalism, which attempts to steer clear of any controversial metaphysical suppositions by drawing heavily on the ideals and values implicit in a democratic society. More generally, it is used to describe the application of constructivism to the political domain. On this general understanding, political constructivism not only covers all of Rawls’ political works, but any political work guided by the idea that an appropriate thought process confers authority onto the resulting political principles. Moreover, since human thought creates the political principles governing our society, human thought can analyze those same principles and either affirm or refute their justification. The fact that we can analyze our principles—and by extension the policies based on them—suggests that we can reason about politics, and the constructivist maintains that our reasons should go a long way toward reconciling political debate and generating agreement in judgment.

This general idea of political constructivism is not too different from other, more familiar views, such as the claim that appropriate prices are the result of open and competitive markets, or the idea that legitimate representatives of a democratic society are the winners of open and fair elections. In each of these cases, the entity in question can be explained in terms of a more fundamental process, for instance, the decisions of various people engaged in markets or electoral processes, together with an explanation of what it means for these processes to be ‘open,’ ‘competitive,’ or ‘fair.’ Political constructivism reflects a similar idea insofar as principles of political action result from a thought-process involving elements more fundamental than the principles themselves, such as attitudes, concepts, ideals, beliefs, values, and precepts. Together, these building blocks help establish a particular set of principles as justified, appropriate, objective, or valid. As a result, constructivism is the view that the best set of political principles is the outcome of an appropriate form of thinking. Importantly, there is no criterion beyond this form of thinking by which we can assess the appropriateness of the principles.

Although constructivism begins with a simple idea, its conception of thinking, or practical reason, is ambitious. We can see this by contrasting it with two competing conceptions of practical deliberations that are more familiar in everyday experience. The first frames practical deliberations within a means-ends relationship whereby practical reason identifies the means by which certain ends are realized. For example, every day we are led by reason to conclude that eating certain foods will satisfy our hunger. In this case, the end to be achieved is immediately given by our natural desire for food; our reason simply discovers the means for satisfying that desire. A second view of practical reason is concerned not merely with identifying the means to some immediately given end, but also with ensuring that the means or action conforms to some moral principle. For example, we may prefer to lie about a particular event, but because we are committed to a principle of honesty, we tell the truth. On this view, the capacity of practical reason extends beyond its instrumental role by including within it a power to check our impulses against moral principles. Notice, however, that in this second example, the rule on which our action is based is still given. There is no implicit or explicit claim that practical reason produces the principle. Instead, practical reason passively receives its command and acts within its limits.

Political constructivism is a different view altogether. The various political principles constraining political action are not merely given to us, but rather are the products of thought. They are not products in the sense of being created from nothing, but rather constructed from various resources appropriate to political argument. Apart from these constructions, there are no moral facts or true moral judgments, nor are there ways of assessing the moral worth of a political action. It is only when deliberations are properly constrained that the resulting outcome is a principle against which our actions can be assessed as morally right or wrong.

Notice that while our political actions are assessed against a normative principle, there is no criterion beyond the deliberative process by which the rightness of the principle is assessed; it is authoritative in virtue of being the outcome of a certain kind of deliberative process or a certain form of argument. Consequently, the challenge for constructivism is to explain the appropriateness of the process without appealing to any judgment that is supposed to derive from that process; for if the thought process relied on such a judgment to assess its appropriateness, it would assume the very thing it claims to construct. It has been argued that constructivism fails to meet this challenge on logical grounds (Cohen 2008). But others have attempted to meet it, and in turn have created a variety of interpretations. This makes the approach difficult to define and summarize. Naturally, a great deal of philosophical debate surrounds the appropriateness of the deliberative process, especially as it concerns the metaethical themes of justification and objectivity. At any rate, despite the extensive literature on the subject, there are two general formulations of political constructivism influenced by two historical accounts of practical reason. The first is a deontological account of practical reason that is primarily associated with Kantian ethics. The second is a teleological account of practical reason that is primarily associated with social contract theory. The Kantian and social contract traditions, although offering differing accounts of practical reason, share much in common, and it would not be an exaggeration to cast constructivism as a contemporary attempt to explain the Rousseauian idea of moral freedom as acting on a law one gives oneself, complemented by the Kantian idea that the law one gives oneself issues from one’s own reason. Political constructivism tries to make this idea clear by identifying a compelling form of normative political analysis with easily understood criteria for thinking about political issues. The hope is that once we are equipped with this form of analysis, we can reason in the light of these criteria and reach agreement in political judgment; and, if not agreement, we can at least narrow our differences sufficiently to secure a just, or fair, or honorable, or decent set of political relations (Rawls 1993, 120).

This article frames political constructivism as a general way of applying constructivism to the political domain. It discusses various interpretations in light of the two general formulations noted above—deontology and teleology. Although the various interpretations discussed do not always fit easily within this distinction, it is nevertheless a useful way of examining political constructivism because deontology and teleology straddle a historical fault line for how best to think about practical reason and the justification of political principles. According to the deontological approach, practical reason is modeled on a mathematical deduction; the aim is to create an argument that should be, so far as possible, a deductive one. By contrast, a teleological account of practical reason has an instrumental form; the aim is to explain how political principles function to realize some end. Examining political constructivism in the light of these formulations exposes the key logical difference between the various interpretations of political constructivism and sets the stage for assessing whether a particular interpretation is more favorable than others.

2. A Brief History

Although the historical influences of constructivism date back to the social contract and Kantian traditions, the contemporary usage of the term seems to have originated with Ronald Dworkin’s 1973 article, “The Original Position” (Dworkin 1973). In this article, Dworkin defends a constructive model of Rawls’s reflective equilibrium over a natural model. Reflective equilibrium refers to a strategy for justifying political principles often associated with political constructivism. The important point here is that a natural model views the relation between moral principles and our more intuitive judgments about ethics as analogous to the relation between scientific laws and empirical data. On a natural model, political theory aims at discovering and describing the normative laws that explain our moral intuitions, not unlike the way natural science aims at discovering and describing the laws of nature that explain our sensory intuitions about the world outside of us. By contrast, a constructive model presents political theory as analogous to legal theory. On this view, political theory aims at constructing political principles that can account for our moral intuitions by bringing as many of those intuitions into a coherent whole with one another, not unlike a judge who, on deciding a case, constructs a legal principle that brings precedent into a coherent whole with a novel yet plausible interpretation of a legal concept.

Dworkin’s constructive model captures a feature of constructivism that has endured into the twenty-first century, namely, that political principles depend on us; they are mind-dependent and result from some interpretive work on our part. To bring this feature into sharp relief, Dworkin contrasts a constructive model with a natural model, again foreshadowing a move familiar in the literature; for it is often the case that constructivists contrast their positions with moral realism when developing their arguments. Moral realism takes many forms, but a common feature of moral realism is that it frames moral judgments in terms of our detecting moral facts. We discover these moral facts not unlike the way we discover the color red—we passively receive the datum. Moreover, these entities are simple in that they cannot be analyzed any further; there are no fundamental elements brought together into a set of considerations from which political principles are formulated. Constructivism differs from moral realism on these various points. In contrast to realism, constructivism holds that actions are judged as right or wrong by measuring those actions against principles that are themselves constructed, not detected. Moreover, these principles are justified in virtue of being constructed from more fundamental elements through an appropriate thought process.

The first major attempt to explicitly develop a constructivist position was John Rawls’ “Kantian Constructivism in Moral Theory,” first published in 1980. Prior to “Kantian Constructivism,” Rawls used the metaphor of a ‘contract’ rather than ‘construction’ to describe his theory, using the adjective constructive simply to mean capable of settling moral disputes. “A refutation of intuitionism,” he writes, “consists in presenting the sort of constructive criteria that are said not to exist” (Rawls 1999a, 35). The aim of A Theory of Justice is to defend precisely these constructive criteria. In “Kantian Constructivism,” Rawls introduces the term constructivism without explanation and uses it to denote a particular kind of political argument reflecting a particular view about justification and objectivity. He writes:

Kantian constructivism holds that moral objectivity is to be understood in terms of a suitably constructed social point of view that all can accept. Apart from the procedure of constructing the principles of justice, there are no moral facts. Whether certain facts are to be recognized as reasons of right and justice, or how much they are to count, can be ascertained only from within the constructive procedure, that is, from the undertakings of rational agents of construction when suitably represented as free and equal moral persons (Rawls 1999b, 307).

This suggests a two-step process: (1) constructing a social point of view acceptable to all, and (2) constructing principles of justice from within that point of view. Somewhere between the publication of “Kantian Constructivism” and Political Liberalism, published in 1993, Rawls decides to express himself differently. When explaining political constructivism, Rawls clarifies what he takes to be constructed:

First, in this form of constructivism, what is it that is constructed? Answer: the content of a political conception of justice. In justice as fairness this content is the principles of justice selected by the parties in the original position … A second question is this: as a procedural device of representation, is the original position itself constructed? No: it is simply laid out (Rawls 1993, 103).

The two-step construction noted above is now reduced to one step, namely, constructing the principles of justice. The social point of view from which the construction takes place is not constructed, but simply laid out. The varieties of constructivism that follow the publication of Rawls’ “Kantian Constructivism” represent competing views on how these separate tasks are to be conducted. While each variant lays out a social point of view and defends competing principles, they have in common the basic idea that the appropriate set of political principles is the outcome of an appropriate form of thinking. Moral judgments are correct when they conform to these principles. The task of political argument is to join together all the relevant elements into one unified scheme of practical reason, that is, a social point of view, so that the deliberations constrained by that scheme arrive at—or construct—the proper principles of justice. Absent this scheme, there are no criteria for guiding political action or justifying our institutions.

And so, in 1980, constructivism begins to take shape as a distinctive approach to moral and political theory. In the decades following “Kantian Constructivism,” the literature on constructivism proliferates at an increasingly fast rate, and an increasing percentage of the literature focuses on moral constructivism rather than political constructivism. The publications of T. M. Scanlon, Christine Korsgaard, and Onora O’Neill begin to form a body of work that, together with John Rawls’, shapes key debates in normative ethics and political philosophy as well as in metaethics.

3. Political Constructivism: Two Formulations

The central idea behind political constructivism is that an appropriate set of political principles is constructed from suitably formed deliberations. These deliberations assemble fundamental elements—such as attitudes, concepts, ideals, beliefs, values and precepts, along with their application to certain problems or contexts in which our normative deliberations take root—into a set of reasons from which principles are formulated. This is an abstract idea that needs to be filled out with some content if it is to be fully understood. The most famous and substantial formulation of it is John Rawls’ theory of justice, which he calls justice as fairness. Justice as fairness begins with a simple idea: the most appropriate conception of justice is one that people would choose in a fair situation (Rawls 1999b, 310). A fair situation is a hypothetical choice procedure called the original position. It organizes various concepts, considered judgments, and precepts into a procedure that frames deliberations. Anyone deliberating within this procedure will reason according to these elements of rationality and reasonableness. In other words, these building blocks provide the raw material from which principles of justice are constructed.

What are these starting points? They include common precepts of rationality, such as these: if one desires a particular end, it is rational to take the means for achieving that end; if the end can be realized in more than one way, it is rational to choose the less burdensome way; if agreements between parties are mutually beneficial and each party can be given full assurance that the other will abide by the terms of the agreement, it is rational to enter into the agreement; if times are uncertain, it is rational to rank alternatives by their worst possible outcome and then pick the alternative whose worst outcome is least bad. These precepts of rationality are guided in their application toward a particular set of ends called primary goods. In addition to these precepts of rationality and their related ends, the original position attempts to model precepts of reasonableness. Reasonable people are ready to propose principles as fair terms of social cooperation and to abide by them willingly, even at the cost of their own interests in particular situations, provided that others accept those terms. Rawls models reasonableness into the original position by including within it a veil of ignorance that precludes parties from knowing their specific circumstances; a condition of publicity that ensures parties understand the public nature of the agreement; a symmetric positioning of the parties’ situation with respect to one another; formal constraints of generality and universality; and a list of traditional principles from which the parties choose (Rawls 1999a, 105–130).
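The last precept in the list above is, in effect, the maximin rule. The following sketch is offered only to make the rule’s mechanics explicit; the alternatives, positions, and payoff figures are invented for the example and are not drawn from Rawls.

# A minimal sketch of the maximin precept: rank each alternative by its
# worst possible outcome, then choose the alternative whose worst outcome
# is least bad. The alternatives and payoffs below are purely illustrative.

def maximin(alternatives):
    """Return the alternative whose minimum payoff is greatest."""
    return max(alternatives, key=lambda name: min(alternatives[name]))

# Hypothetical payoffs to the social positions one might turn out to occupy.
alternatives = {
    "principle A": [10, 6, 5],   # worst outcome: 5
    "principle B": [30, 8, 1],   # worst outcome: 1
    "principle C": [12, 9, 4],   # worst outcome: 4
}

print(maximin(alternatives))     # prints: principle A

Under the kind of uncertainty the veil of ignorance imposes, the rule selects “principle A,” since its worst case (5) is better than the worst cases of the other options.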

The precepts of rationality and reasonableness are modeled as a thought procedure anyone can enter into at any time. In A Theory of Justice, Rawls argues that anyone deliberating from within the original position will arrive at the same conclusion—they will choose the same two principles of justice. As a result, the original position realizes the general aim of constructivism by bringing together abstract precepts of rationality with a conception of persons and society in a set of reasons that supports a particular set of principles. Indeed, Rawls’s procedural argument is so well known and so well developed that constructivism is often taken to be synonymous with the idea that whatever results from a hypothetical thought experiment, such as the original position, constitutes the correct set of principles. For example, some describe the constructivist as a hypothetical proceduralist. “He endorses some hypothetical procedure as determining which principles constitute valid standards of morality” (Darwall, Gibbard, and Railton 1992, 140). Similarly, Brian Barry defines constructivism as “a theory to the effect that what comes out of a certain kind of situation is to count as just” (Barry 1991, 266). Sharon Street says that the bumper sticker slogan of constructivism is “no normative truth independent of the practical point of view” (Street 2010, 366). The works of T. M. Scanlon deepen the characterization of constructivism as a form of proceduralism, and critics have further solidified this interpretation by fixing on various weaknesses of procedural arguments. The combined effect is that proceduralism has become the default interpretation of political constructivism.

Proceduralism has taken many forms since the publication of A Theory of Justice. For example, Rawls’ later works use complex conceptions of persons and society to give the original position a more substantive form (Rawls 1993, 93). Since these conceptions are informed by the shared public values of a democratic society, the starting points of construction are more substantive than those identified by A Theory of Justice. Indeed, many of the debates and criticisms of a procedural formulation of political constructivism center on whether the starting points should be more universal and objective, as in A Theory of Justice, or more local and substantive, as in Political Liberalism.

The procedural interpretation of political constructivism is by far the most common, but it is not the only one. A second, less developed account is already present in “Kantian Constructivism,” where Rawls draws a link between the original position and the practical task of political argument. Rawls begins his article in a manner consistent with a procedural formulation by noting that “What distinguishes a Kantian form of constructivism is essentially this: it specifies a particular conception of the person as an element in a reasonable procedure of construction, the outcome of which determines the content of the first principles of justice” (Rawls 1999b, 304). However, he quickly adds that the Kantian conception of justice is meant to address an impasse in our recent political history, namely, “the apparent conflict between freedom and equality in a democratic society” (Rawls 1999b, 305). This impasse, and the attempt to break it, impacts the argument’s logical structure, for principles are now justified in virtue of their breaking the impasse rather than in virtue of being the outcome of a choice procedure. Constructivism becomes “political” not because it appropriates political values, but because it engages in a practical enterprise of solving political problems. Christine Korsgaard expresses this idea when she writes, “Rawls, like Hobbes before him, thinks that justice is the solution to a problem” (C. Korsgaard 2003, 112). On this formulation, justice as fairness is justified if it solves the conflict between freedom and equality in a democratic society. If it does not solve the problem, it is unjustified.

This second account of constructivism might be called the practical formulation of constructivism. Together with the procedural account, political constructivism reflects two great traditions of moral and political thought—Kantian ethics and social contract theory. The procedural formulation echoes Kantian ethics: Kant employed the Categorical Imperative to determine whether subjective maxims are universalizable and thus objectively valid. The Kantian Categorical Imperative specifies a moral point of view that might be described as “suitably joining together all the requirements of our (human) practical reason, both pure and empirical, into one unified scheme of practical reasons” (Rawls 1999b, 515). This scheme guides deliberations so as to construct correct moral judgments. By contrast, the social contract tradition identifies the state of nature as a structural problem in need of rectification. It is the nature of the problem that frames deliberations on the content of the contract. Once the contract is established and agreed upon, it places new obligations upon the contracting parties, thereby constructing a moral order that had previously not existed.

The development of constructivism over the past three decades reflects these two traditions. Sometimes the particular variant of constructivism emphasizes one tradition over the other; sometimes it trades on both. In any case, a critical division between the variants concerns whether the constructed principles are formulated and justified independently of any conception of the good those principles might later realize, or whether they are formulated and justified in relation to that good. The former is deontological; the latter is teleological. Procedural formulations are typically deontological in that they are modeled on mathematical proofs that move from widely acceptable axioms to more substantial political theorems. Practical formulations tend to be teleological insofar as the practical analysis is guided by the good that would be realized should the problem be resolved.

The procedural and practical formulations of constructivism serve as two entry points for understanding how political constructivism has been applied and might further be developed. Which formulation proves more successful depends on which can make more sense of the idea that the best political action is an action conforming to a normative law we give ourselves for reasons we all can share. Only that variant will be “constructive” in the sense of being “capable” of settling morally charged political disputes. Or, absent the ambitious goal of actually settling disputes on reasons we all can share, the successful variant should at least fix the point at which political disagreements arise by bringing out into the light of day the reasons why people arrive at political judgments that are not only different but are in fact incommensurable.

4. Political Constructivism and Procedures

One way to describe the procedural formulation of political constructivism more thoroughly is to recall that constructivism can be characterized as a view about the nature of political argument or analysis, especially as it pertains to justification and objectivity. If political principles are to be justified as obligatory and morally authoritative, it is insufficient to derive them from a social point of view without also explaining why that social point of view is itself authoritative; for absent a defense of the point of view, the purported justification of principles will appear wanting. In the time since Dworkin introduced the term, political philosophers have developed three general strategies for defending the elements of a procedure. They include reflective equilibrium, narrowing the scope of the investigation, and, at its most ambitious, elucidating the demands of practical reason from which normative political principles can be established.

Reflective equilibrium refers to a back and forth process that seeks coherence among the different parts of a conception of justice. These parts include the principles of justice, the conditions of the hypothetical procedure, and the firm moral judgments we make in everyday life. Once equilibrium is achieved, the different parts of the theory are justified in terms of their mutual support. The “key idea underlying reflective equilibrium is that we test various parts of our system of moral beliefs against other beliefs we hold, seeking coherence among the widest set of moral and non-moral beliefs by revising and refining them at all levels” (Daniels 1996, 2). Accordingly, the fundamental elements comprising a hypothetical procedure are justified in virtue of their supporting and being supported by the match between the outcome of the procedure (the principles of justice) and our firmly held moral intuitions, which Rawls calls considered judgments. “By going back and forth,” Rawls writes, “sometimes altering the conditions of the contractual circumstances [hypothetical procedure], at others withdrawing our [considered] judgments and conforming them to principle, I assume that eventually we shall find a description of the initial situation that both expresses reasonable conditions and yields principles which match our considered judgments duly pruned and adjusted.  This state of affairs I refer to as reflective equilibrium” (Rawls 1999a, 18).

Critics have raised tough questions about a coherentist justification of political principles; for if our intuitive moral judgments form part of the justificatory process, then the resulting principles cannot serve as independent standards against which those same judgments can be assessed and found wanting (Hare 1973, 147; Nagel 1973, 228; Sandel 1998, 49). The risk is that circular reasoning slips into the process and undermines its justificatory force. To strengthen the critical dimension of the resulting political principles, procedural constructivists have tended to move in one of two directions. They have either conceded moral breadth in order to strengthen the justificatory core of constructivism by narrowing the scope of its investigation (James 2012; Roberts 2007), or they have refocused their attention on accounts of agency and rationality in order to more clearly elucidate the demands of practical reason (O’Neill 1996).

Rawls’ writings subsequent to A Theory of Justice can be interpreted as taking the former path. In these works, he paid increasingly close attention to liberal values by linking justification to “our deeper understanding of ourselves and our aspirations,” and bracketing “claims about the essential nature and identity of persons” (Rawls 1999b, 306–07, 388). The conditions of the original position are therefore conditions already accepted by members of a liberal democracy, or conditions such members could be made to accept because of their implicit presence within the public culture of a democratic society they share. The hope is that by localizing practical reason to a particular kind of political tradition one can simultaneously strengthen the justification of the argument for that audience. There is no attempt to provide a comprehensive normative political argument true for all peoples at all times. Instead, the program is much more modest, relying on values already at home in the subject addressed.

This strategy has been criticized on a number of grounds. Some have questioned the veracity of Rawls’s empirical claims, others worry that the search for stability within a pluralist society lowers the bar of justification too much, and still others claim that Rawls’ conception of persons remains too ideal and detached from reality (Klosko 1993; Barry 1995; O’Neill 1996). Although these criticisms are forceful objections to the usual interpretation of Rawls’ Political Liberalism, Aaron James has developed a variant of the strategy less susceptible to them. James describes political constructivism as “a methodology of substantive justification… The hope is to show, as though by something vaguely akin to mathematical demonstration, that proposed principles can be worked out, in steps which are themselves manifestly reasonable, from rudimentary and highly plausible ideas arising from within a society’s own essentially social kind of practical reason” (James 2013, 251–52). The aim is to “justify principles that tell us how existing versions of the practice would have to be reformed if they are to be justifiable” (James 2012, 29). If practices such as constitutional democracies or global free trade regimes are not inherently unjust, then this could be an attractive path to pursue, since the fundamental elements from which principles are constructed are contained within the practice itself. These elements include the practice’s participants, its purpose, and the circumstances favorable to its continuation over time. Provided the description of the practice is accurate and generally acceptable, the argument in favor of a particular set of principles should be authoritative for that practice.

One criticism of this strategy is that it turns a contingent empirical fact into an absolute constraint on one’s conception of justice, thereby undermining that conception’s critical leverage by rendering it ill-equipped to determine why these practices might fall short of justice (Valentini 2011, 412). This becomes apparent in Rawls’ Law of Peoples, which “sets out guidelines for a liberal society’s foreign policy in a reasonably just Society of Peoples” (Rawls 2001, 128). The concern is that a state-centric global order (or peoples-centric global order) lends itself to certain injustices because there are no overarching institutions that can foster trust and cooperation among nations. The Law of Peoples fails to shed light on the unwelcome incentives created by a state-centric order, since it assumes from the beginning that the practice is normatively innocuous and, as a result, risks justifying an unjust, or less than fully just, status quo.

In order to avoid this outcome, one would have to either attach the fundamental elements of construction to the good realized by the practice in question, or move in the opposite direction by recovering the more abstract features of practical rationality. The former option shifts the grounds of justification toward a teleological structure, which is associated with the practical formulation of constructivism discussed in the next section. The latter option alone remains within the framework of a deontological justification, and is perhaps best illustrated by some of the work of Onora O’Neill. O’Neill’s constructivism abstracts from our more richly idealized conception of persons by articulating more meager—and thus, she believes, more easily justifiable—precepts of rationality, agency, and mutual independence (O’Neill 1988). For example, O’Neill maintains that rationality can be construed as the capacity to understand and follow some form of social life; and mutual independence can be interpreted as an agent’s capacity to develop varying sorts and degrees of dependency and interdependency. These elements help frame the question: What principles can a plurality of agents of minimal rationality and with varying degrees of dependence live by? While these minimal, formal requirements of rationality and agency might be too meager to construct substantive principles of justice entitling people to certain goods, O’Neill thinks they can inform us as to which principles a group cannot live by. The elements of construction therefore help us construct principles of obligation prohibiting those actions that undermine the capacity for agency.

O’Neill’s variant reclaims the moral breadth and universality of normative political principles by constructing them from fundamental elements that are generally weak and widely acceptable. She believes every rational person can understand and accept these fundamental elements and thus can agree on the obligations constraining their actions. Rawls suggests something similar in A Theory of Justice. Like O’Neill, Rawls thinks that the justification of justice as fairness rests in part on generally shared and preferably weak conditions. Moreover, his published articles leading up to A Theory of Justice have been described as beginning with “as narrow and morally neutral a conception of rational agency as can plausibly be drawn” (Wolff 1977, 13). The ambition reflected in these works concerns the derivation of substantive principles from formal premises through a kind of rational choice bargaining game. O’Neill can be interpreted as developing a similar position by returning to these earlier ambitions, albeit not in the language of rational choice theory. Together with the more descriptively rich practice-based variant suggested by James, the procedural formulation of constructivism can be characterized as moving in either an abstract, more universal direction, or a substantive, more localized direction. Some have tried to bridge the two ends of the spectrum by suggesting various levels of construction. For example, Peri Roberts argues that primary constructions start from bare concepts of persons and society and formulate general principles of justice with universal scope (Roberts 2007). However, once armed with these bare concepts and general principles, the constructivist can thicken the concepts in a secondary procedure by drawing on the ideals and values of a particular society.

What is common to each of these arguments is their form. Each aims to construct an argument modeled on a mathematical demonstration. The hope is to move from generally weak and broadly acceptable axioms to more substantial political theorems via a procedure of construction. The strength of this form of constructivist argument depends not only on the plausibility of the procedure, but also on whether the appropriateness of the procedure can be specified without appealing to the kinds of normative judgments the procedure is supposed to produce; for if the appropriateness of the procedure depended on such judgments, it would assume the very thing it claims to construct. G. A. Cohen has argued that Rawls’ constructivism fails to meet this challenge because the two principles resulting from the original position depend for their justification on unarticulated background principles of justice (Cohen 2008). Cohen’s argument is based on a deeper thesis about the relationship between facts and principles. On Cohen’s view, “a principle can respond to (that is, be grounded in) a fact only because it is also a response to a more ultimate principle that is not a response to a fact” (Cohen 2008, 229). This is a logical argument. If it is correct, it strikes a notable blow against the constructivist position; for if the procedure reflects factual considerations, as procedures often do, then Cohen can maintain that anyone affirming a principle resulting from the procedure must also affirm a more fundamental principle that survives the denial of those same facts. These fact-insensitive principles are the valid principles of justice; they are logically prior to the principles generated by a procedure and thus are not themselves constructed.

Cohen’s criticism is directed against John Rawls, but it applies to any form of constructivism that uses facts about persons and society when formulating the procedure. The general idea, already reflected in a number of other criticisms of constructivism, is that the process of constructing substantive normative principles relies upon unarticulated, non-constructed principles. Consequently, the constructivist cannot maintain the view that all political principles are constructed.

5. Political Constructivism and Social Problems

In A Theory of Justice, Rawls writes: “The theory of justice is a part, perhaps the most significant part, of the theory of rational choice” (Rawls 1999a, 15). Describing A Theory of Justice as a rational choice theory is less common than it used to be. In his Reconstruction and Critique of A Theory of Justice, Robert Paul Wolff speculates that Rawls’ “original intention must have been to write a book very much like Kenneth Arrow’s Social Choice and Individual Values” (Wolff 1977, 4). Wolff then goes on to interpret Rawls’s work in terms of a rational choice model. Similarly, in his 1989 treatise, Theories of Justice, Brian Barry interprets Rawls’ argument from two perspectives: a rational choice model and a competing approach Barry calls justice as impartiality. A rational choice characterization of A Theory of Justice views the participants of the original position as engaged in a bargaining contract concerning political principles. The failure to establish an agreement returns a person to her position or holdings prior to any cooperative arrangement, and this position is called the noncooperative baseline. Now, it is assumed that the parties are rationally motivated by their own self-interest to move beyond the noncooperative baseline and arrive at a mutually advantageous arrangement. In rational choice theory, the most mutually advantageous set of outcomes is referred to as the Pareto Frontier. It represents a set of efficient outcomes insofar as, once on the frontier, it is not possible to improve one person’s position without worsening another’s. The deliberations within the original position represent a move from the noncooperative baseline to a specific point on the Pareto Frontier.
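
To make this rational-choice vocabulary concrete, the following sketch checks which outcomes in a toy two-party example both improve on a noncooperative baseline and lie on the Pareto Frontier. The payoffs and function names are invented purely for illustration and correspond to nothing in Rawls or Barry.

# Hypothetical example: each outcome is a pair (party_1_payoff, party_2_payoff).
baseline = (1, 1)                        # holdings absent any cooperative arrangement
outcomes = [(2, 4), (3, 3), (4, 2), (2, 2)]

def improves_on_baseline(outcome, baseline):
    # Cooperation is rational only if no party does worse than under no agreement.
    return all(o >= b for o, b in zip(outcome, baseline))

def is_pareto_efficient(outcome, outcomes):
    # Efficient if no other outcome makes someone better off without making anyone worse off.
    return not any(
        other != outcome and all(x >= y for x, y in zip(other, outcome))
        for other in outcomes
    )

frontier = [o for o in outcomes
            if improves_on_baseline(o, baseline) and is_pareto_efficient(o, outcomes)]
print(frontier)  # [(2, 4), (3, 3), (4, 2)]; (2, 2) is excluded because (3, 3) dominates it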

Rawls would later regret having described his theory as part of a rational choice theory, calling it a very misleading error (Rawls 1999b, 401). Nevertheless, what is particularly interesting about a rational choice characterization of A Theory of Justice is that it reflects, to some extent, the two different formulations of constructivism. On the one hand, rational choice models embody the rigor and certainty of mathematical demonstrations insofar as substantive conclusions are thought to derive from premises that, though not formal, are generally weak and widely acceptable. The procedural formulation of constructivism reflects this mathematical model. On the other hand, rational choice models are often described as solutions to problems cast as bargaining games. If the bargaining game concerns the problem of justice—or how the benefits and burdens of social cooperation are to be divided among people conceived as free and equal—then the problem itself contains normative resources for constructing the principles that will serve as the solution. The practical formulation of constructivism reflects this key idea. Notice that the two formulations locate the resources for constructing political principles in different places. The procedural account locates the fundamental elements in generally weak and broadly acceptable ideas and precepts. These building blocks are articulated independently of the good they may help bring about when assembled into principles and applied to the situation. Conversely, the practical account locates the fundamental elements of construction in relation to the good realized when the principles are applied. This is because principles of justice are conceived as solutions to problems rather than outcomes of procedures. We begin not with generally weak and widely acceptable ideas about persons and society but rather with particular problems faced by individuals. Moreover, it is in formulating the problem clearly that we are directed to its solution, since the problem contains resources that point us toward that solution. It is with these resources that the practical account of constructivism in part begins. Consequently, the conceptual starting points are in part located in the good realized once the solution is applied and the problem resolved.

Christine Korsgaard offers a variant of this formulation by characterizing the concept of justice as a solution to a distribution problem concerning collectively created goods (C. Korsgaard 2003). A conception of justice is a particular solution to this problem; it should answer questions like: Who gets what? Who makes what? How much of what one makes should one get? Who is excluded from getting what others have made? A society can consistently answer these questions over time by referring to—implicitly or explicitly—principles of justice. These principles might assign rights or entitlements to individuals, or they might ensure fair and open access to the courts, or they might protect political voice. In each case they must express a particularly thick conception of political right by providing a fairly specific solution to the problem. For example, a conception of justice might express a libertarian set of principles, such as Nozick’s principles of acquisition and transfer; or it might express a liberal egalitarian principle, such as Rawls’ difference principle. On Korsgaard’s view, the task of practical philosophy is to move from abstract normative concepts, such as justice, to a particular normative conception, such as Rawls’s justice as fairness, “by constructing an account of the problem reflected in the concept that will point the way to a conception that solves the problem” (C. Korsgaard 2003, 116). Constructivism does this by conceiving normative concepts and principles as functional—they play a particular role in helping solve the various practical problems that arise in social life. In the absence of such problems, constructivism does not have a toehold from which to begin constructing principles of justice.

There are two important features of a practical formulation of constructivism. First, the resources for constructing principles are in part located in the practical problems humans face, or more precisely the good brought about when those problems are solved. Consequently, we must first look to the nature of the problem before we can understand why principles are obligatory and for what reasons they are authoritative. Second, “a sufficiently detailed and accurate description of the problem actually yields the solution” (C. Korsgaard 2003, 115). This is because the precepts of practical reason and conceptions of persons from which principles are constructed arise from within the problem itself. To see this, consider Korsgaard’s moral constructivism, which in its most recent formulation is primarily concerned with the problem of agency, or the question: How is it possible for a person to act autonomously and effectively over time? (C. M. Korsgaard 2009). Korsgaard begins with the observation that humans are free; it is an inescapable fact of life that we are free to choose and act. The process of acting freely is at the same time a process of constructing our identities over time. If we are to construct unified lives, we need both the freedom to act and a set of principles for determining the reasons on which we act. In the absence of freedom our choices would fail to be our own and we would cease being the authors of our lives. In the absence of principle-governed action, our choices would be arbitrary and we would fail to create unified lives reflecting identity and integrity. The problem is to articulate a concept of freedom that is also law-abiding. Korsgaard adopts Kant’s Categorical Imperative as the solution, since the Categorical Imperative tells us to act in such a way that the rule on which one acts can be adopted as a law by all rational persons. Insofar as the imperative recognizes the action as being caused by the person, it preserves freedom. Insofar as it requires the universalization of the rule on which the action is based, it preserves lawfulness. Indeed, Korsgaard thinks the Categorical Imperative is constitutive of autonomous, effective agency. That is, we simply cannot understand ourselves as autonomous, unified agents without also ascribing to ourselves this particular principle of practical reason. Consequently, a sufficiently detailed and accurate description of the problem of human agency actually yields the Categorical Imperative as a solution, or so Korsgaard argues.

Korsgaard is admittedly concerned with a constructivist account of practical reason rather than a constructivist account of justice or legitimacy. Whether such a constructivist project is plausible is beyond the scope of this article (see “Constructivism in Metaethics”). But what is important about Korsgaard’s constructivism is that it articulates a notably different structure of justification than the one expressed by procedural formulations. According to proceduralism, principles are justified when they result from a suitably framed procedure, similar to the way presidents become legitimate by running in fair and open elections. By contrast, Korsgaard justifies principles in terms of their function—they solve practical problems. Moreover, principles are objective when they uniquely solve the problem, that is, when there exists no competing principle that can also solve the problem. To illustrate the point, Korsgaard draws on Rawls’ Political Liberalism and the problem of liberalism. Rawls describes the problem of liberalism as follows: “[H]ow is it possible for there to exist over time a just and stable society of free and equal citizens, who remain profoundly divided by reasonable religious, philosophical, and moral doctrines?” (Rawls 1993, xx, xxvii, 4). Korsgaard thinks Rawls’ two principles solve this problem insofar as they describe what a liberal society must do in order to be liberal (C. Korsgaard 2003, 115). Consequently, Rawls’ conception of justice is justified; it functions so as to solve the problem of liberalism. However, Rawls’ conception of justice is not the only justified conception, since other liberal conceptions can also solve the problem. It follows that Rawls’s principles are justified but not objectively so, since they do not uniquely solve the problem. Consequently, rival conceptions of liberalism are equally defensible insofar as they function equally well. In order to construct a uniquely objective set of principles, one must abstract a common core from the several justified sets of principles. Rawls does something like this when he identifies three abstract principles characteristic of any liberal society. These include: (1) the specification of certain basic rights, liberties and opportunities; (2) an assignment of priority to those rights with respect to claims of the general good; and (3) some measure assuring to all citizens adequate all-purpose means to make effective use of their rights (Rawls 1993, 6). Although neither Rawls nor Korsgaard makes this argument, it is possible to think of this abstract core as an objective set of principles constitutive of liberalism, since one could hardly describe a liberal society without also presupposing them as governing principles.

Another way to frame Rawls’ Political Liberalism within a practical formulation of constructivism is in terms of the moral concerns implicit in the problem of liberalism. These concerns can serve as criteria for assessing whether principles solve the problem. Again, take Rawls’ problem of liberalism. It asks how it is possible for there to exist over time a just and stable society of free and equal citizens, who remain profoundly divided by reasonable religious, philosophical, and moral doctrines (Rawls 1993, xx, xxvii, 4). Notice that the problem attaches a set of concerns to a fact about society. The relevant fact is the fact of pluralism—in a liberal society citizens are profoundly divided by reasonable conceptions of a good life. The concern is social stability given this fact. The challenge is to find a set of principles that answer this concern. In addition to the fact and concern, the problem of liberalism expresses a conception of reasonableness. Citizens are reasonable when they (a) are ready to propose fair terms of cooperation they reasonably believe those to whom they are offered can reasonably accept, and (b) appreciate certain factors, or burdens of judgment, that render it impossible to fully reconcile disagreements over all matters of value, including some matters of justice (Rawls 1993, xliv, 58–59). Certain comprehensive doctrines—the stuff of pluralism—become reasonable when those holding them recognize the social implications of the burdens of judgment, and allow the effects of this recognition to take root in their “attitude (including toleration) toward other comprehensive doctrines” (Rawls 1993, 375).

With these building blocks in place, we can begin to see how the problem of liberalism points the way toward a solution. The problem expresses concerns and concepts that can be formulated as criteria for assessing principles of justice. For example, the criterion of reciprocity obliges citizens to defend their political positions with reasons they honestly believe those to whom they are offered might reasonably accept (Rawls 1993, xliv). It is implicit in the concern for stability among reasonable citizens profoundly divided by reasonable conceptions of a good life. Consequently, the problem of liberalism contains within it the resources for articulating the standards against which competing conceptions of justice can be assessed. If citizens find the formulation of the problem compelling, they will simultaneously agree on these standards, since these standards are already implicit in the formulation of the problem. Conceptions of justice meeting these standards are justified because they answer the concerns reflected in the problem and thus function so as to solve the problem.

This is a powerful form of political argument. Its essential point is that the epistemic standards for assessing rival conceptions of justice are internal to the problems we encounter in social life. Analyses of these problems can uncover the standards against which principles are justified. This creates a straightforward, instrumental assessment of political principles and the public policies based on them. Principles and policies are justified when they answer the concerns implicit in the problem. The moral authority of these principles and policies is felt by anyone recognizing the problem.

The practical interpretation of constructivism is not without its difficulties, since the justification of principles hinges on the description of the problem. It is not obviously the case that people will agree on the formulation of the problem. For example, if one accepts Rawls’ description of the problem of liberalism, then one is also committed to accepting some conception of liberal justice as binding on social practices. But one might not accept Rawls’ description of the problem and thus fail to see how the principles solving Rawls’ problem are binding on one’s actions. Consequently, the practical interpretation of constructivism shifts the question of justification onto the descriptions of problems. This mirrors the way in which a proceduralist formulation shifts the question of justification onto the account of procedures. In each case, the justification of principles first requires a defense of something else: the procedure or the problem.

Korsgaard and Rawls represent different directions for addressing this difficulty. Korsgaard hopes to ground the description of agency on generally weak and widely acceptable ideas about freedom and unity. By contrast, Rawls localizes the description of the problem to a particular domain of political concern. These two directions mirror the two directions taken by those developing a procedural formulation of constructivism. In both cases the idea is to offer a better defense of the fundamental elements from which principles are constructed, for in the absence of such a defense the principles themselves will lack justificatory force.

6. Conclusion

In his Reconstruction and Critique of A Theory of Justice, Robert Paul Wolff suggests that the problem with which Rawls begins is not the impasse in our recent political history concerning the conflict between freedom and equality, but rather “the impasse in Anglo-American ethical theory at about the beginning of the 1950’s” (Wolff 1977, 11). This latter impasse concerns the debate between utilitarianism and intuitionism during the first half of the twentieth century. Wolff interprets Rawls as trying to advance normative political theory beyond this impasse by drawing on each position’s respective strengths without succumbing to their fatal flaws. The strength of utilitarianism is its straightforward assertion of human happiness as the metric by which moral right is measured. It offers a clear, plausible, and constructive criterion for settling moral disputes on reasons all can understand. Its fatal flaw, however, is that the metric itself—overall human happiness—can also serve as a reason for violating the individual autonomy and freedom of persons. Intuitionism avoids this fatal flaw by flatly asserting the inviolability of human autonomy and freedom, thus protecting individuals against those who might sacrifice human rights in order to achieve a greater good. Its fatal flaw, however, is that it offers no reason for treating autonomy and freedom as inviolable, and thus fails to explain why these features of human dignity place moral constraints on actions that might otherwise produce some valuable end.

Wolff interprets Rawls as sketching a way out of this impasse by developing an account of practical reason that grounds the metric of moral assessment on reasons all can understand. For “without rational grounds for choosing one system of ends or goals rather than another… we would be forced to retreat to the subjectivity of prudence, as utilitarianism, for all its efforts to the contrary, ultimately does; or else we would, in desperation, simply have to posit substantive objective moral principles without a suggestion of rational argument, as does intuitionism” (Wolff 1977, 20).

The impasse Wolff describes is indeed the impasse constructivism tries to break. In each of the variants described above, the aim is to provide a method of analysis by which a set of principles can be justified. This is accomplished by defending—or making plausible—the use of certain fundamental elements in the construction of a favored set of principles. Moreover, the analysis should be as clear and as easy to follow as a utilitarian analysis. Indeed, it is in the clarity of the analysis that constructivism’s greatest impact ultimately rests, since such clarity makes for a compelling form of political argument. What constructivism is ultimately concerned with is the nature of normative political argument, and each variant described above can be interpreted as an effort to find a compelling form of political argument that can justify normative political principles. In short, it seeks a methodology of substantive justification (James 2013, 251). The various political principles constraining public policy are the result of this methodology. Or, to put the same point the other way around, the method of political analysis constructs the principles. Apart from these constructions, there are no moral facts or true political judgments, nor are there ways of assessing the moral appropriateness of political action. It is only when deliberations are properly constrained by a particular methodology that the resulting product is a principle against which our policies can be assessed as right or wrong. If the methodology or form of political argument is compelling, then a basis for settling fundamental political questions can be established on reasons all can understand. Although such a basis cannot guarantee agreement, it should at least narrow our political debates by cementing the point at which disagreements arise, bringing out into the light of day the reasons why people arrive at political judgments that are not only different but are sometimes incommensurable. Consequently, the holy grail of political constructivism is not a set of principles we can all agree upon, but rather a method of normative political analysis so compelling that no clear-headed person could plausibly reject it without appearing entirely tone-deaf to the kinds of concerns peculiar to political life. Such a method would enable society to deal with its political problems in a constructive manner by systematically building upon previous successes in an ongoing struggle to make the public domain as just as it can possibly be.

7. References and Further Reading

  • Barry, Brian. 1991. Theories of Justice. University of California Press.
  • Barry, Brian. 1995. “John Rawls and the Search for Stability.” Ethics 105 (4): 874–915.
  • Cohen, G. A. 2008. Rescuing Justice and Equality. Cambridge, Mass.: Harvard University Press.
  • Daniels, Norman. 1996. Justice and Justification: Reflective Equilibrium in Theory and Practice. Cambridge University Press.
  • Darwall, Stephen, Allan Gibbard, and Peter Railton. 1992. “Toward Fin de Siècle Ethics: Some Trends.” The Philosophical Review 101 (1): 115–89. doi:10.2307/2185045.
  • Dworkin, Ronald. 1973. “The Original Position.” The University of Chicago Law Review 40 (3): 500–533. doi:10.2307/1599246.
  • Hare, R. M. 1973. “Rawls’ Theory of Justice—I.” The Philosophical Quarterly 23 (91): 144–55. doi:10.2307/2217486.
  • James, Aaron. 2012. Fairness in Practice: A Social Contract for a Global Economy. Oxford University Press.
  • James, Aaron. 2013. “Political Constructivism.” In A Companion to Rawls, edited by Jon Mandle and David A. Reidy, 251–64. John Wiley & Sons.
  • Klosko, George. 1993. “Rawls’s ‘Political’ Philosophy and American Democracy.” American Political Science Review 87 (2): 348–59. doi:10.2307/2939045.
  • Korsgaard, Christine. 2003. “Realism and Constructivism in Twentieth-Century Moral Philosophy.” Journal of Philosophical Research 28: 99–122.
  • Korsgaard, Christine M. 2009. Self-Constitution: Agency, Identity, and Integrity. Oxford; New York: Oxford University Press.
  • Lenman, James and Yonatan Shemmer, eds. 2012. Constructivism in Practical Philosophy. Oxford: Oxford University Press.
  • Nagel, Thomas. 1973. “Rawls on Justice.” The Philosophical Review 82 (2): 220–34. doi:10.2307/2183770.
  • O’Neill, Onora. 1988. “The Presidential Address: Constructivisms in Ethics.” Proceedings of the Aristotelian Society 89: 1–17.
  • O’Neill, Onora. 1996. Towards Justice and Virtue: A Constructive Account of Practical Reasoning. Cambridge; New York: Cambridge University Press.
  • O’Neill, Onora. 2003. “Constructivism in Rawls and Kant.” In The Cambridge Companion to Rawls, edited by Samuel Freeman, 347–67. Cambridge University Press.
  • Rawls, John. 1993. Political Liberalism. Columbia University Press.
  • Rawls, John. 1999a. A Theory of Justice, Revised Edition. Harvard University Press.
  • Rawls, John. 1999b. Collected Papers. Harvard University Press.
  • Rawls, John. 2001. The Law of Peoples: With “The Idea of Public Reason Revisited.” Harvard University Press.
  • Roberts, Peri. 2007. Political Constructivism. Routledge.
  • Sandel, Michael J. 1998. Liberalism and the Limits of Justice. 2nd ed. Cambridge University Press.
  • Street, Sharon. 2010. “What Is Constructivism in Ethics and Metaethics?” Philosophy Compass 5 (5): 363–84.
  • Valentini, Laura. 2011. “Global Justice and Practice-Dependence: Conventionalism, Institutionalism, Functionalism.” Journal of Political Philosophy 19 (4): 399–418. doi:10.1111/j.1467-9760.2010.00373.x.
  • Wolff, Robert Paul. 1977. Understanding Rawls: A Reconstruction and Critique of A Theory of Justice. Princeton, N.J.: Princeton University Press.

 

Author Information

Michael Buckley
Email: michael.buckley@lehman.cuny.edu
City University of New York
U. S. A.

The Ethics of Economic Sanctions

Economic sanctions involve the politically motivated withdrawal of customary trade or financial relations from a state, organisation or individual.  They may be imposed by the United Nations, regional governmental organisations such as the European Union, or by states acting alone.

Although economic sanctions have long been a feature of international relations, the end of the Cold War in the late 20th century saw a significant proliferation of their use.  Concerted international action became possible where previously any action by the West was countered by the U.S.S.R. and vice versa.  This meant that for the first time the United Nations Security Council could impose economic sanctions that, in theory at least, all member states were required to take part in.  With this came the possibility of inflicting serious damage.  Most notable during this period were the comprehensive sanctions imposed on Haiti, the former Yugoslav republics and Iraq.  The harms caused to Haiti and the former Yugoslav republics were severe, but the harms suffered by Iraq were the worst ever caused by the use of economic sanctions outside of a war situation.  UNICEF, for example, estimated that the economic sanctions imposed on Iraq led to the deaths of 500,000 children aged under five from malnutrition and disease.

Following the devastation caused by economic sanctions in Iraq, a wide variety of organisations began to seriously investigate the possibility of alternative forms of economic sanctions, sanctions not targeted against ‘ordinary people’ but rather against those considered to be morally responsible for the objectionable policies of the target state.  The results—‘targeted’ economic sanctions—became the UN’s economic sanctions tool of choice throughout the 2000s.  Targeted economic sanctions include measures such as freezing the assets of top government officials or those suspected of financing terrorism, arms embargoes, nuclear sanctions and so on.  The harms inflicted by targeted sanctions are, for the most part, much less extensive than those inflicted by previous episodes of economic sanctions which targeted entire populations.  Nevertheless, they are not harmless and may still be morally problematic.  For example, the arms embargo imposed during the break-up of the former Yugoslavia was widely criticised as it did not permit the Bosnian Muslims to acquire the weapons they needed to defend themselves from the genocidal attacks of certain Bosnian-Serb forces.

Despite the obvious and serious moral problems associated with economic sanctions, the ethics of economic sanctions is a topic that has been curiously neglected by philosophers and political theorists.  Only a handful of philosophical journal articles and book chapters have ever been published on the subject.  This article describes the work that has been carried out.

Table of Contents

  1. The Nature of Economic Sanctions
    1. Definition
    2. Objectives
      1. Achievement of Foreign Policy Goals
      2. International Law Enforcement
    3. Mechanisms
      1. Economic Pressure
      2. Non-Economic Pressure
      3. Direct Denial of Resources
      4. Message Sending
      5. Punitive Mechanisms
    4. Summary
  2. The Ethics of Economic Sanctions
    1. Just War Theory
      1. Objections to the Use of Just War theory: Christiansen and Powers
      2. Further Objections to the Use of Just War Theory
    2. Theories of Law Enforcement
    3. Utilitarianism
    4. “Clean Hands”
    5. Summary
  3. References and Further Reading
    1. On the Nature of Economic Sanctions
    2. On the Ethics of Economic Sanctions
    3. Other Referenced Works

1. The Nature of Economic Sanctions

a. Definition

Economic sanctions are the deliberate withdrawal of customary trade or financial relations (Hufbauer et al., 2007), ordered by a state, supra-national or international governmental organisation (the ‘sender’) from any state, sub-state group, organisation or individual (the ‘target’) in response to the political behaviour of that target.

The specific elements of this definition merit some discussion. First, economic sanctions may comprise the withdrawal of customary trade or financial relations in whole or in part.  Trade may be restricted in its entirety by refusing all imports and exports.  If all imports and exports are refused then the sanctions are known as ‘comprehensive’ sanctions.  (Though note that even in the case of comprehensive sanctions humanitarian exemptions are usually made, for example, for food and medicine).  In other cases, only some imports or exports are refused—usually commodities like oil and timber—or weapons in the case of arms embargoes.  Financial restrictions include measures such as asset freezes, the denial of credit, the denial of banking services, the withdrawal of aid and so on.  Again, withdrawal of financial relations may be comprehensive or not.

Second, economic sanctions may be ordered (or ‘imposed’) by a variety of actors.  Sanctions can be ‘multilateral’, ordered by the United Nations or regional organisations such as the European Union, or they can be ‘unilateral’, ordered by one state acting alone.  The actor ordering economic sanctions is typically known as the ‘sender’ of the sanctions.

In practical terms, contemporary economic sanctions are imposed by following a legal process.  For example, economic sanctions mandated by the United Nations Security Council are required to be adopted by all member states under chapter VII of the United Nations Charter.  States then pass legislation prohibiting their citizens from entering into trading and/or financial relationships with the target and setting penalties for sanctions-breaking.  So although we often talk of sanctions being ‘imposed’ on the target, it should be clear that economic sanctions are actually legal measures imposed by a sender on its own members.  It is a sender’s own citizens who are prohibited from trading.

Further, note that this definition excludes measures undertaken by non-state actors, for example, consumer boycotts or boycotts undertaken by companies or religious organisations.  Such measures are undeniably worthy of ethical enquiry; however, the ethical concerns they present are sufficiently distinctive to make it sensible to treat them as a separate issue.

Third, states are not the only targets of economic sanctions.  Economic sanctions can be, and often are, imposed on sub-state groups.  Well known examples from the recent past are the sanctions imposed on Serb-controlled areas of the former Yugoslavia in the 1990s or the ban on trade in conflict diamonds that targeted sub-state rebel groups in parts of Africa.  Economic sanctions can also be imposed on companies, organisations and individuals.  For example, the UK regularly freezes the UK-held assets of companies, charities or individuals suspected of funding terrorist activities.  For this reason it is perfectly possible for a state to sanction its own citizens.  Those on the receiving end of economic sanctions are typically known as the ‘target’.

In recent years there has been a shift away from targeting entire states, and towards targeting economic sanctions more narrowly at specific sub-state groups and individuals—those considered responsible for the political behaviour the sanctions are responding to.  The reasons for this are two-fold.  First, it is expected that such sanctions are more likely to achieve their objectives.  Second, narrow targeting makes it less likely that the harms of sanctions will fall on innocent people.  Economic sanctions that are narrowly targeted in this way are known as ‘targeted’ or ‘smart’ sanctions.  There is no common term for sanctions imposed on an entire state.  This article suggests ‘collective’.

Fourth, under this definition, economic sanctions are imposed in response to the political behaviour of the target—as distinguished from its economic behaviour.  Such a stipulation is common in the economic sanctions literature.  For example, Robert Pape distinguishes economic sanctions from what he calls ‘trade wars’:

When the United States threatens China with economic punishment if it does not respect human rights, that is an economic sanction; when punishment is threatened over copyright infringement, that is a trade war (Pape, 1999, 94).

However, not everyone accepts this distinction.  David Baldwin, for instance, denies that economic sanctions must be a response to political behaviour.  For Baldwin economic sanctions can be a response to any type of behaviour—there is no reason to restrict the definition of economic sanctions to those measures which aim to respond to political behaviour.  Thus, contra Pape, Baldwin argues that if the U.S. imposes restrictions on trade with China over copyright issues then this is an economic sanction.  Further, he argues that in any case there is no clear-cut distinction between the ‘political’ and the ‘economic’ and so there would be no clear-cut basis for making the distinction even if it were warranted (Baldwin, 1985).

In response to Baldwin, it is worth pointing out that in common usage the term ‘economic sanctions’ is actually reserved for a distinctive class of cases that we can roughly describe as being a response to political rather than economic behaviour.  Baldwin is right that there is no clear-cut distinction between the political and the economic, but to categorise responses to both as economic sanctions is to ignore the fact that people do actually manage to make the distinction in practice.

Finally, the definition presented here makes no reference to the objective sought by economic sanctions or the mechanism by which they are expected to work.  This is an advantage, since both the question of the proper objectives of sanctions and the question of how they work are controversial.

b. Objectives

Economic sanctions theorists tend to conceptualise economic sanctions in one of two ways: as tools of foreign policy or as tools of international law enforcement.  As tools of foreign policy, their objective is to achieve foreign policy goals.  As tools of international law enforcement, their objective is to enforce international law or international moral norms.

i. Achievement of Foreign Policy Goals

Economic sanctions are most commonly conceptualised as being tools for achieving foreign policy goals.  They are considered part of the foreign policy ‘toolkit’ (a range of measures that includes diplomacy, propaganda, covert action, the use of military force, and so forth) that politicians have at their disposal when attempting to influence the behaviour of other states.  The foreign policy conception comes in both simple and more sophisticated versions.

In the simple version, the objective of economic sanctions is to change or prevent a target’s ‘objectionable’ policy or behaviour where a policy or behaviour is understood to be ‘objectionable’ if it conflicts with the foreign policy goals of the sender.

However, a frequent criticism of economic sanctions is that—if these are their goals—then economic sanctions don’t work.  That is, they usually fail to change or prevent a target’s objectionable policy or behaviour (Nossal, 1989).  This concern has led some to ask the question: if economic sanctions don’t work, why do we keep using them?   The attempt to answer this question has led some theorists to develop more sophisticated conceptions of economic sanctions.

It has been argued, for instance, that although changing a target’s ‘objectionable’ policy or behaviour is sometimes the objective of economic sanctions, politicians often employ economic sanctions in much more nuanced and subtle ways (Baldwin, 1985, Cortright & Lopez, 2000).

First, Baldwin argues that economic sanctions are often employed with the more limited objective of influencing a target’s ‘beliefs, attitudes, opinions, expectations, emotions and/or propensities to act’ (Baldwin, 1985, 20).  No immediate policy or behaviour change is expected—even if, in the long term, some change is hoped for.  In such cases Baldwin argues that economic sanctions are being used symbolically to ‘send a message’.  They can signal specific intentions or general foreign policy orientations or they can be used to show support or disapproval for the policies of other states.  If the economic sanctions are imposed at some cost to the sending state then this demonstrates the sender’s commitment to its position and strengthens the message being sent.  Importantly, even if the objective of an episode of economic sanctions is to ‘send a message’, it is unlikely to feature as the officially stated objective.  The message is stronger if the sanctions are framed as demanding a change in the target’s objectionable policy or behaviour—even if it is clear that the economic sanctions alone cannot hope to change this behaviour.

Second, Baldwin argues that economic sanctions may have multiple objectives of which some will be more important to the sender than others.  Behaviour change might be a sender’s secondary or even tertiary objective whilst ‘sending a message’ might be the primary objective.  Even if the most important objective for the sender is to ‘send a message’, the economic sanctions must be framed as demanding behaviour change if this secondary or tertiary objective is to be met.

Third, economic sanctions may have multiple targets.  For example, if economic sanctions are employed as a general deterrent, then there will be many targets of the influence attempt extending well beyond the original recipient of the economic sanctions (Baldwin, 1985).

David Cortright and George A. Lopez have also worked on developing more sophisticated understandings of economic sanctions.  Economic sanctions, they argue, can be imposed for purposes that include deterrence, demonstrating resolve, upholding international norms and sending messages of disapproval as well as influencing behaviour change (Cortright & Lopez, 2000).

Finally, Kim Richard Nossal argues that senders might also have retributive punishment as their objective.  In other words, the intent is to inflict economic harm, purely for its own sake, on a target the sender regards as having wronged it, and not to achieve any change in behaviour or policy.  For Nossal, to be clear, saying a sender has been ‘wronged’ is not to say it has been morally wronged.  It is only to say that the target’s actions have displeased the sender.  Thus, on Nossal’s account, senders can ‘punish’ agents who—objectively—have done nothing morally wrong, just as a mafia boss might ‘punish’ underlings who have been passing information to the police.  Again, it is important to realise that even if the purpose of the economic sanctions is retributive punishment, it is unlikely to be stated as such by the sender for fear of appearing irrational or vindictive (Nossal, 1989).

For all these reasons it would be a mistake to assume from the fact that economic sanctions often fail to achieve their stated objectives that economic sanctions do not work; stated objectives are not always true objectives.  The true objectives might be to punish or to send a message.  Even when the stated objectives are true objectives they may not be the primary objectives.

Given the above discussion, it appears that changing or preventing objectionable policies or behaviour, ‘sending a message’, and punishment are all possible objectives of economic sanctions.

ii. International Law Enforcement

Alternatively, economic sanctions are sometimes conceptualised as being a tool for enforcing international law or international norms of behaviour.  On this conception, the ultimate objective of economic sanctions is understood to be international law enforcement.

For Margaret Doxey, enforcement of the law through the use of economic sanctions might take several forms.

First, enforcement might involve the ending of ongoing violations of international law/norms—the domestic analogy is that of stopping a crime in progress.  Doxey’s own example is that of economic sanctions imposed to reverse the illegal invasion of the Falkland Islands by Argentina (Doxey, 1987, 91).

Second, enforcement might require preventing violations of international law from occurring in the first place.  The domestic equivalent is that of preventing a known criminal conspiracy from being realised.  As Doxey notes, under Chapter VII of the UN Charter, given adequate support from its members, the Security Council can designate any situation a threat to peace and then order preventive action to ensure that the threat is not realised (Doxey, 1987, 91).

Third, enforcement might require that economic sanctions are imposed punitively subsequent to violations of international law to deter either the recipient state or others from repeating the violations. Here economic sanctions are ‘a kind of fine for international misbehaviour’ (Doxey, 1987, 92).

The main difference between the law enforcement and the foreign policy conceptions of economic sanctions is that the former claims that the objectives of economic sanctions are purely to enforce international law/international norms of behaviour, whereas the latter claims that the objectives of economic sanctions are determined by a sender’s foreign policy.  Of course the two conceptions are not mutually exclusive.  A given sanctions episode may align with a sender’s foreign policy goals and work to enforce international law.

This difference between the two conceptions can partially be explained with reference to the focus of the respective theorists’ studies: those employing a foreign policy conception tend to focus on cases where states are the senders of economic sanctions, whereas those employing a law enforcement conception tend to focus on cases where the UN is the sender.  Undoubtedly the foreign policy conception fits states better than the UN and the law enforcement conception fits the UN better than states.  However, it would be wrong to say that the foreign policy conception applies to states and the law enforcement conception to the UN.  States can also act to enforce international law.  Likewise, the UN is not immune to the national interests of its more powerful member states.

To summarise then, these are the possible objectives of economic sanctions:

  1. To change or prevent objectionable or unlawful policies or behaviour
  2. To send a message with regards to objectionable or unlawful policies or behaviour
  3. To punish objectionable or unlawful behaviour on deterrent or retributive grounds

c. Mechanisms

Whatever the objectives of economic sanctions, we also need to address the question of how economic sanctions work.  Five mechanisms are discussed here: economic pressure, non-economic pressure, direct denial of resources, message sending and punitive mechanisms.

i. Economic Pressure

In the 1970s and 80s, theorists of economic sanctions began addressing the question of how economic sanctions work, taking as their model collective sanctions imposed on states—the predominant mode of sanctioning at the time.  They theorised that economic sanctions achieved behaviour/policy change via the imposition of economic pressure.  Robert Pape sums this view up well when he states that economic sanctions ‘seek to lower the aggregate economic welfare of a target state by reducing international trade in order to coerce the target government to change its political behaviour’ (Pape, 1997, 94).  In elaborating on this mechanism Pape argues that:

Targets of economic sanctions understand they would be better off economically if they conceded to the coercer’s demands, and make their decision based on whether they consider their political objectives to be worth the economic costs. (Pape, 1997, 94)

A similar view is shared by Hufbauer, Schott, and Elliott, who use the following framework to analyse the utility of economic sanctions:

Stripped to the bare bones, the formula for a successful sanctions effort is simple: The costs of defiance borne by the target must be greater than its perceived cost of compliance.  That is, the political and economic costs to the target from sanctions must be greater than the political and security costs of complying with the sender’s demands. (Hufbauer et al., 2007, 50)
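
Stated as a bare decision rule, this condition can be put as a simple inequality (an illustrative formalisation only; the notation is not Hufbauer, Schott, and Elliott’s own):

\[ C_{\text{defiance}} > C_{\text{compliance}} \]

where \(C_{\text{defiance}}\) denotes the political and economic costs the sanctions impose on the target, and \(C_{\text{compliance}}\) denotes the political and security costs to the target of complying with the sender’s demands.  On this reading, the target is expected to concede only when the inequality holds.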

Indeed, the view that economic sanctions work via the imposition of economic pressure is the most widely accepted in the literature.  Johan Galtung even calls it ‘the general theory of economic sanctions’ and he elucidates it as follows.  Focussing on collective economic sanctions, Galtung argues that the objective of economic sanctions is to cause an amount of economic harm sufficient to bring about the ‘political disintegration’ of the state which, in turn, will result in the state being forced to comply with the sender’s demands.  For Galtung, ‘political disintegration’ is a split in the leadership of a state or a split between the leadership and the people that occurs as people within the state disagree about what to do with regards to the sanctions and the resulting economic crisis.  This may involve popular protest and the government being forced to change the objectionable or unlawful policy for fear of losing power.

Under what Galtung calls the ‘naïve theory’ of economic sanctions (which he rejects), the more severe the economic pressure, the faster and more significant the political disintegration and the sooner the state will comply.  This theory is naïve, Galtung argues, because it does not take into account the fact that sanctions might—at least initially—result in political integration, as the people of the state pull together in the face of adversity.  This is especially likely to occur if the target government can muster up the spirit of nationalism.  Indeed, ‘rally-round-the-flag’ effects are often cited as a reason for the failure of economic sanctions.  Under Galtung’s ‘revised theory’ of economic sanctions, economic pressure results initially in political integration but will eventually lead to political disintegration as economic pressure increases; but, he warns, the levels of economic harm required for this might in some cases be exceptionally severe (Galtung, 1967).

With regards to targeted sanctions, it seems possible that they could also sometimes operate via an economic pressure mechanism.  For example, asset freezes on top government officials might pressure them into changing the objectionable or unlawful policy/behaviour if the amounts involved were significant enough.

ii. Non-Economic Pressure

Baldwin, however, argues that although economic pressure is one possibility for how economic sanctions might work, it is not the only one.  In particular, he argues that economic sanctions do not have to cause economic harm to work.  He argues that even if the economic sanctions make barely a dent in a target state’s economy, its government may be moved to act out of a concern to avoid international embarrassment or a reputation as a pariah state.  This is particularly likely to occur when targets believe themselves to be members in good standing of international society.  Suffering international condemnation might be unacceptable to them.  In other cases Baldwin argues that targets might worry that the economic sanctions are a prelude to war.  Since a just war must be a last resort, those about to resort to war often impose sanctions first—either in a genuine attempt to reach a non-military resolution or, more cynically, to demonstrate to domestic and international audiences that non-military methods have been attempted and failed—thus making war the last resort. A target might comply with the economic sanctions not because they damage the economy but out of concern to avoid war (Baldwin, 1985).  The pressure employed here does not derive from the economic effects of the sanctions.  Both collective and targeted economic sanctions may utilise a non-economic pressure mechanism.

iii. Direct Denial of Resources

Economic sanctions employing either the economic or non-economic pressure mechanisms work only indirectly: pressure is applied to targets to force them to change their objectionable/unlawful policies themselves.  Thus such sanctions are sometimes referred to as ‘indirect’ sanctions (Gordon, 1999).

However, economic sanctions can also operate directly by denying a target the resources necessary for pursuit of their objectionable/unlawful policy.  For example, if the objectionable/unlawful policy of the target state is its militarisation, then economic sanctions might be designed to damage the target state’s economy so thoroughly that it does not have the resources available to build up or maintain its military capacity, or they might involve arms embargoes or nuclear sanctions.  Similarly, asset freezes of either state funds or the funds of government officials may operate with a direct mechanism.  Freezing Libya’s state funds and the funds of Colonel Gaddafi was intended to make it impossible for him to pay mercenaries during the Arab Spring.  In addition, the freezing of assets suspected of belonging to terrorist groups is intended to make financing terrorist operations more difficult.  Such ‘direct sanctions’ do not apply pressure to the target to change their objectionable/unlawful policy themselves but instead work directly by denying the target the resources it needs to pursue the objectionable/unlawful policy.

iv. Message Sending

Of course, not all economic sanctions aim to change or prevent an objectionable/unlawful policy.  Some aim only to ‘send a message’.  If the objective of the economic sanctions is simply to ‘send a message’, then the imposition of sanctions in itself should be sufficient to achieve this—causing economic harm should not be necessary.  Having said this, there are undoubtedly ways of making the message stronger, and causing some economic harm to the target might do this.  Of course, as both Baldwin and Doxey note, this is not the only way to strengthen the message.  If the sanctions are costly to the sender because, for instance, they involve putting a stop to valuable exports, then this willingness of the sender to bear costs shows how seriously it takes the situation.

v. Punitive Mechanisms

Punishment necessarily involves the infliction of some harm, suffering or otherwise unpleasant consequences on the target, and this is the case whether the objective of the punishment is to deter or whether the punishment is purely retributive in nature.  Thus economic sanctions imposed as punishment must generally inflict some economic harm; alternatively, if a target state (or organisation or individual) is particularly sensitive about its standing in the international community, symbolic sanctions expressing international condemnation might suffice as punishment.

d. Summary

The possible objectives of economic sanctions, together with each objective’s related mechanism(s), can be summarised as follows:

  • To change or prevent objectionable or unlawful policies or behaviour: economic pressure, non-economic pressure, or direct denial of resources
  • To send a message: message sending, strengthened where the sanctions impose costs on the sender or the target
  • To punish on deterrent or retributive grounds: punitive mechanisms, whether the infliction of economic harm or symbolic condemnation

2. The Ethics of Economic Sanctions

At least four moral frameworks have been used to consider the ethics of economic sanctions: just war theory, theories of law enforcement, utilitarianism, and ‘clean hands’.

a. Just War Theory

Of the few writers who have considered the ethics of economic sanctions, the majority point to the analogies between economic sanctions and war and use just war theory as a framework within which to assess their moral permissibility.  Some extend the framework only to collective, comprehensive economic sanctions (Gordon, 1999) while others extend it to all types of economic sanctions (Pierce, 1996, Winkler, 1999, Amstutz, 2013).

Just war theory is split into two parts: jus ad bellum, which sets out the principles that must be followed for the resort to war to be just, and jus in bello, which sets out the principles that must be followed during war.  (Some just war theorists add a third part, jus post bellum, which sets out the principles that must be followed post-war, but since no writers on economic sanctions consider jus post bellum, it has been left out of the following analysis.)  Those writers who employ just war theory as a moral framework believe that these principles can—with minor adjustments—serve as a moral framework for economic sanctions, as follows.

There are six principles of jus ad bellum.  For the resort to war to be just, all six conditions must be met.

Just Cause: There must be a just cause for war.  In mainstream just war theory, just cause is limited to:

  • the defence of a state from an actual or imminent military attack; and
  • humanitarian intervention in cases where a state is committing extremely serious human rights violations against its own citizens.

Theorists applying this principle to economic sanctions widely agree that there is just cause to impose economic sanctions if their aim is:

  • to defend a state from the target’s actual or imminent military attack; or
  • to stop extremely serious human rights violations being carried out by the target against its own citizens.

Some theorists go further and allow greater latitude for the case of economic sanctions, arguing that there is just cause for economic sanctions in situations of serious injustice that nevertheless fall short of just cause for war (Amstutz, 2013).

However, under the just war framework, there is no just cause for economic sanctions with punitive objectives.  Likewise, there is no just cause for economic sanctions imposed preventively, to head off future (but non-imminent) attacks.  The theorists in question do not consider economic sanctions designed to ‘send a message’, but since such sanctions do not aim to defend a state from military attack or to stop serious human rights violations but aim merely to change attitudes, beliefs, and so forth, it would seem that there would be no just cause for them on this approach.  Therefore, economic sanctions designed to punish, or to prevent non-imminent objectionable/unlawful policies or behaviour, would be ruled out, as would all sanctions designed to ‘send a message’.

Proportionality: The harm that will foreseeably be caused by the war must not be disproportionate to the good that it is hoped will be achieved.  The good consequences to be counted are limited to those specified in the just cause, i.e. putting a stop to any attack or human rights abuses.  Any incidental good consequences, such as the kick-starting of an economy, should not be included in the proportionality calculation.  However, the harmful consequences of war are not limited to certain types and should all be counted.   Further, the calculation must include the harms suffered by all parties to the war and those suffered by neutral states.

For economic sanctions, this principle is met if the good achieved by the sanctions is expected to outweigh the harms of those sanctions.  The good to be counted is the ending of the attack, human rights abuses or other injustice.  The harms to be counted include not just those suffered by target citizens but also those suffered by sender citizens.  It is worth remembering that citizens of sender states can suffer either directly, if their business relies on trade with the target, or indirectly, if the economy of the sending state is particularly reliant on trade with the target.

There is nothing essential to the nature of economic sanctions that would prevent the proportionality condition being met.

Right Intention: The decision to go to war must be made with the right intention—the intention to achieve the just cause.  The just cause must not be a pretext for some unjust end that is secretly intended.  Therefore, economic sanctions must be imposed with the intention of defending a state from attack or stopping/reducing human rights violations.  There is nothing essential to the nature of economic sanctions that prevents this condition from being fulfilled.  However, Winkler warns that, as a matter of fact, there is a propensity for economic sanctions to be imposed without clear purpose and this means that the requirement of right intention might not be met in many actual cases (Winkler, 1999).

Legitimate Authority: The decision to go to war must be made by a legitimate authority, that is, one which has the moral right to act on behalf of its people and take them into a war.  In international law there is a presumption that the governments of all states are legitimate authorities.  According to mainstream just war theory, private individuals may not wage war.  According to A. J. Coates, war is a legal instrument, and the power to enforce the law is vested in the government on behalf of the political community.  Thus, private war is an instance of taking the law into your own hands and is a kind of vigilante justice (Coates, 1997).

There is nothing essential to the nature of economic sanctions that would prevent this condition being met.  However, if we take the war/economic sanctions analogy seriously, the legitimate authority condition implies that private boycotts of a target state’s products by individuals, companies or other organisations are wrongful—a kind of vigilante justice.  This is a conclusion that many would be unwilling to accept.

Last Resort: War must be the last resort.  Given the horrendous harms it creates, war must be necessary in order to be just.  If other, less harmful, alternatives are available such as economic sanctions or diplomatic measures, then war is not necessary and therefore not just.  Under just war theory it is not the case that all the alternative measures must actually be attempted first: if it is obvious they would not work then there is no requirement to make such attempts.

Clearly, if war must be the last resort, it cannot be a requirement that economic sanctions are also a last resort.  The equivalent requirement given is that economic sanctions must be the last resort short of war (Winkler, 1999, 145) or that less harmful or less coercive means must be attempted before economic sanctions may be imposed (Amstutz, 2013, 217 ).  Again there is nothing essential to the nature of economic sanctions that would prevent them being the least harmful or coercive means available.  However, it is worth noting that the harmful effects of economic sanctions have been underestimated in the past and it is not inconceivable that the harms of economic sanctions could exceed those of war in a given case.

Reasonable Chance of Success: There must be a reasonable chance of success.  This is to prevent hopeless wars where people die pointlessly.

This condition is particularly pertinent for economic sanctions.  Historically, economic sanctions have been accused of ‘never working’ (Nossal, 1989).  If this were true then economic sanctions would never be morally permissible under just war theory.  However, it is not true.  The most comprehensive study of the effectiveness of economic sanctions to date concluded that economic sanctions succeeded (achieved their stated objectives) in one third of cases (Hufbauer et al., 2007).  This figure is disputed and is not in any case particularly high.  However, it seems fair to say it is not impossible for economic sanctions to work.  Therefore this condition could be met in specific cases.

Having addressed the principles of jus ad bellum, it is clear that some economic sanctions may meet the conditions.  However, it is still necessary to consider jus in bello.  As with jus ad bellum, all the conditions of jus in bello must be met for an individual military action to be morally permissible.  However, there is only one principle that is particularly relevant to economic sanctions and that is the principle of discrimination.

Discrimination: The principle of discrimination requires attackers to distinguish between two classes of people in war, combatants and non-combatants, and stipulates their different treatment.  According to the principle of discrimination, it is morally permissible to attack combatants at any time.  Non-combatants, on the other hand, have immunity from attack, and it is never morally permissible to attack them directly.  However, it is sometimes morally permissible to harm non-combatants as an unintentional side effect of an attack against combatants or military property under the doctrine of double effect.  The doctrine of double effect acknowledges that one action (for example, bombing a weapons factory) can have two effects: the intended effect (destroying a weapons factory) and a foreseen but unintended side effect (killing non-combatants who live nearby).  According to the traditional doctrine of double effect, it is morally permissible to bring about a harmful side effect if it is a foreseen but genuinely unintended consequence of pursuing some good end that is intended—so long as the harm of the side effect is not disproportionate to the intended good end.

Michael Walzer, however, significantly revises the traditional doctrine of double effect, and it is worth considering his revision here because most of those writing on economic sanctions use Walzer’s version.  Walzer adds a further condition to the doctrine.  It is not good enough, Walzer argues, that the harm to non-combatants be unintended and not disproportionate; we should expect soldiers to take positive steps to minimise harm to non-combatants, even if this imposes costs on themselves.  As he puts it, ‘[d]ouble effect is defensible…only when the two [effects] are the product of a double intention: first, that the ‘good’ be achieved; second that the foreseeable evil be reduced as far as possible’ (Walzer, 2006, 155).  Only in this case are the side-effect harms to non-combatants morally permissible.

In the case of economic sanctions, though, who are the equivalent of ‘combatant’ and ‘non-combatant’?  Pierce argues that the individuals falling into the class of ‘combatants’ are those who are actually part of the causal chain of events that led to the objectionable or unlawful policy: those who planned and organised it, and those who are carrying it out (Pierce, 1996, 102).  Similarly, for Winkler, combatants are those who plan and carry out the objectionable or unlawful policy (Winkler, 1999, 149).  For Amstutz, combatants are ‘the government and the elites that support it’ (Amstutz, 2013, 217).  Gordon is not clear on who counts as a ‘combatant,’ but she is clear about who she thinks does not: ‘those who are least able to defend themselves, who present the least military threat, who have the least input into policy and military decisions, and who are the most vulnerable’ (Gordon, 1999, 125).  On any of these definitions, it is clear that in cases where a target state is pursuing an objectionable/unlawful policy, there will be both ‘combatants’ and ‘non-combatants’ amongst its citizens.

It is generally agreed by writers employing the just war framework that collective sanctions violate the principle of discrimination.  Where the collective sanctions involve an indirect economic pressure mechanism, economic harms are intentionally inflicted on the population in the hopes they will protest and force their government to change their objectionable policies.  Given that some of the population will count as ‘non-combatants’, this involves the intentional infliction of harm on non-combatants and straightforwardly violates the principle of discrimination.

Where the collective sanctions involve a direct denial of resources mechanism, for example, an attempt to destroy an economy to end a state’s militarisation, the harm to non-combatants is not intended but it is foreseeable and it is still problematic.  In the memorable words of Joy Gordon, such sanctions are like a ‘siege writ large’.  The sanctions prevent the import of goods into a country just as a surrounding enemy army would a castle or city.  Thus sanctions are vulnerable to the same moral criticisms as a siege.  Sieges do not discriminate between combatants and non-combatants.  In fact, in a siege it is usually the non-combatants who suffer the most, since increasingly scarce resources will be allocated as a matter of priority to the army or leadership.  As Gordon states, in both sieges and in the case of comprehensive collective sanctions ‘the harm is done to those who are least able to defend themselves, who present the least military threat, who have the least input into policy or military decisions, and who are the most vulnerable’ (Gordon, 1999, 125).  Sieges, then, neither discriminate between combatants and non-combatants nor demonstrate an intention to minimise harms to non-combatants.  Therefore, even if the harms are not intended, they cannot be justified under Walzer’s revised doctrine of double effect.

In summary, all writers employing the just war framework justify its use by drawing an analogy between economic sanctions and war.  The just war framework then leads them to conclude that collective sanctions are always impermissible because they violate the just war principle of discrimination.  Pierce, Winkler and Amstutz further extend the use of just war principles to targeted economic sanctions and conclude that targeted economic sanctions that do not harm ‘non-combatants’ may be morally permissible because it is at least theoretically possible that they can meet all the just war principles.  This would appear to be a neat solution to the issue of the ethics of economic sanctions.  However, there are objections to this approach.

i. Objections to the Use of Just War Theory: Christiansen and Powers

Christiansen & Powers argue that there are significant differences between the case of war and the case of collective, comprehensive economic sanctions and therefore that the just war principles provide an inadequate framework for the moral analysis of such economic sanctions.  In particular they argue that the principle of discrimination does not apply to the case of economic sanctions.

For them, the most important differences between war and economic sanctions are that (1) economic sanctions are imposed as an alternative to war, not as a form of war (sieges during a war being a form of war), and (2) economic sanctions—if carefully designed and monitored—cause less harm than war.  They argue that the just war principles—in particular the principle of discrimination—exist to prevent military conflicts heading down the road to ‘total war’, a hellish situation where anything goes.  They are an attempt to keep war within some kind of limited civilised control.  However, they argue, the intent behind economic sanctions is to avoid war altogether, to stop us even starting upon the road to total war.  This being so, there is no reason why the principles governing war—including the principle of discrimination—should also govern economic sanctions (Christiansen & Powers, 1996, 101-109).

Of course that still leaves open the question of what principles should govern economic sanctions, particularly when it concerns questions of inflicting harm on ‘non-combatants’ or, as they put it ‘innocent’ people.  Christiansen & Powers argue that in certain cases it is permissible to harm innocent people by means of economic sanctions—even intentionally—so long as their basic rights are not violated.  As they state:

“Another model for thinking about sanctions may be found in the distinction between basic rights and lesser rights and enjoyments.  This may prove more useful than the just war principle of [discrimination] as a paradigm for economic sanctions.  As long as the survival of the population is not put at risk and its health is not severely impaired, aspects of daily life might temporarily be degraded for the sake of restoring the [more basic] rights of others” (Christiansen & Powers, 1996, 107).

Christiansen and Powers go on to argue that there are two further differences between war and economic sanctions that also lend support to abolishing the principle of discrimination.  They argue (1) that a population might consent to suffer economic sanctions, in which case harming them would not violate their rights, and (2) that a population can in fact bear moral responsibility for the actions of its government, for example, by supporting or not opposing them, and so not qualify as ‘non-combatant’ or innocent.  They argue that neither of these considerations is available in the case of war.

It is first worth pointing out that they are surely wrong about these considerations not being available in the case of war.  A population suffering severe human rights violations such as ethnic cleansing or genocide might consent to military intervention to help protect them.  Likewise, if we can hold a population morally responsible for the actions of their government because they supported them or did not oppose them, then we can do this whether economic sanctions or war are being considered.  Nevertheless, their arguments that consent or moral responsibility on the part of the innocent population renders harm to that population morally permissible can be considered on their own merits.  Let us consider each in turn.

If an individual genuinely consents to suffer harm then her rights are not violated since she has waived her right to not be harmed in this way.  To give an example, it is often argued that the Black population of South Africa consented to the anti-Apartheid sanctions and that this justified the harms they suffered.  The consent argument, of course, only applies where the innocent population does in fact consent.  This is something that is very difficult to establish.  Further, even if it can be shown that the majority of a population consent to the sanctions, it is unlikely that every last person will do so.  Hence the consent justification is unlikely to justify all targeting of innocent people.

Christiansen & Powers further argue that we can consider a population morally responsible for its government’s policies if they support them or fail to oppose them—at least where the state in question is a democracy and opposition does not meet with serious penalties.  In such cases, they argue, the population is not innocent and so it is morally permissible to target them directly with economic sanctions.  They give the example of the White population of South Africa, arguing that the White population shared responsibility for the Apartheid policies of their government and therefore it was morally permissible to target them directly with economic sanctions.  However, even if it is accepted that supporting or failing to oppose objectionable/unlawful policies renders one morally responsible and non-innocent, it is very unlikely that every last person in a state is actually supporting—or not opposing—the policies.  There is almost always some opposition, however small.  Further, one would not normally attribute moral responsibility for such actions to children.  They remain innocent.  Hence, even if we were to accept the idea that supporting—or even just failing to oppose—one’s government was sufficient for the attribution of moral responsibility—a state would still have some innocent members amongst its population.

Christiansen & Powers conclude by offering their own moral framework which, while clearly influenced by just war theory, has significant differences.  The most significant difference is the absence of the principle of discrimination and two replacement principles as follows:

A Commitment to and Prospects for a Political Solution: Sanctions should be pursued as an alternative to war, not as another form of war.  They must be part of an abiding commitment to and a feasible strategy for finding a political solution to the problem that justified the imposition of sanctions in the first place.

Humanitarian Proviso: Civilians should be immune from grave and irreversible harm from sanctions, though lesser harms may be imposed on the civilian population.  Provision must be made to ensure that fundamental human rights, such as the right to food, medicine, and shelter, are not violated. (Christiansen & Powers, 1996, 114)

ii. Further Objections to the Use of Just War Theory

It has been argued that the revisions made to the just war principles—considered above—do not go far enough.  The just war principles are derived from a set of complex and detailed arguments all planted firmly within the context of war.  These arguments contain premises that, whilst they may hold true in the case of war, do not always hold true in the case of economic sanctions.  Therefore, a much more thoroughgoing revision of just war principles is required if they are to be applied to the case of economic sanctions (Ellis, 2013).

Further, while there are differences between war and collective comprehensive economic sanctions, there are even greater differences between war and targeted economic sanctions.  These also call into question the use of a just war framework (Ellis, 2013).  For example, why should an arms embargo—which aims to prevent or mitigate a war—be considered under the same principles governing the resort to war or the fighting of it?  There is no obvious reason why it should.

b. Theories of Law Enforcement

As we have seen, one way of conceptualising economic sanctions is as a tool of international law enforcement: a means to prevent, terminate or punish violations of international law or international moral norms.  Therefore, it would seem natural to analyse the ethics of economic sanctions using a framework based on the ethics of law enforcement.  Theorists who have done this (Damrosch, 1994; Lang, 2008) argue that the use of economic sanctions as a tool of law enforcement faces significant moral challenges, as follows.

Legitimate Authority: Many argue that only a legitimate authority has the right to enforce the law.  An authority is considered legitimate if she (or it) is morally justified in exercising that authority.  Opinion is divided on what exactly makes an authority legitimate, but two oft-cited necessary conditions are (1) the consent of those subject to the authority (either tacit or explicit), and (2) impartiality on the part of the authority; that is, the authority should have no reason to favour the interests of one party over the interests of any other (Rodin, 2002, 176-177).

In the domestic case, it is widely accepted that states (at least democratic states) have the legitimate authority to enforce domestic law against citizens.  Therefore agents of the state (police, judges, prison officers) have the legitimate authority to prevent, terminate and punish crime in a way that ordinary citizens do not.  If ordinary citizens attempt to prevent, terminate and punish criminals themselves—without any state involvement—this is closer to vigilantism or revenge than law enforcement.

However, in the international case the picture is more complex.  Although (at least democratic) states are regarded as having legitimate authority over their own citizens, they are not regarded as having legitimate authority over the citizens of foreign states or over foreign states themselves.  First, they lack the consent of foreign citizens or states.  Second, they lack impartiality since, in any international dispute, they are likely to prefer their own national interest over the interest of foreign states or citizens.  This position on the legitimate authority of states is consistent with the fundamental principle of international law that all sovereign states are equal in the international system.

Different considerations apply when it comes to the United Nations.  Is the United Nations a legitimate authority?  The UN certainly does claim the authority to interpret international law and to enforce it—at least in the area of peace and security.  According to the UN Charter, the Security Council has the authority to require that all UN member states impose economic sanctions on those states or individuals it deems a threat to peace and security.  However, many would argue that this authority is illusory since the UN lacks the power to enforce its own judgments on matters of international law.  This is because the UN relies on support of member states to achieve law enforcement, and this is not always forthcoming.  Further, the permanent members of the Security Council can veto any action the UN proposes.  Other critics would argue that whatever de facto authority the UN has, that authority is not legitimate; some question whether the UN really has the consent of member states, others question whether or not the UN, dominated as it is by the five permanent members of the Security Council, is really impartial.

This leads many to conclude that (1) there is no entity in the international system with the legitimate authority to enforce the law, and (2) therefore there is no possibility of morally justified law enforcement at the international level.

Principled Basis: In order to be morally justified on the basis of law enforcement, the sanctions must be a response to violations of genuine international law or international moral norms (Damrosch, 1994).  This is not as straightforward as it sounds.  International law is a very different matter to domestic law; there is considerable dispute about the moral norms that hold sway internationally and whether or not they even count as real laws.  While economic sanctions imposed in response to violations of the rules against aggression or genocide would pass this test easily, other moral norms are more questionable; to borrow an example from Damrosch, is democratic governance an international moral norm?

Consistency: Law enforcement should be consistent—it is a fundamental principle of justice that like cases are treated alike.  It is unfair if one state or individual is prevented from carrying out an activity or punished for it, when another is not (other things being equal).  Yet, all our evidence to date shows that economic sanctions are not imposed consistently—they are not regularly and reliably imposed on those who violate international law or international moral norms.  With regards to the UN, the national interests of the UN Security Council members are more a guide to the likelihood of sanctions being employed than the fact of a violation (Damrosch, 1994).  The situation for states is no different.  This should not be surprising: consistency in law enforcement is a product of impartiality, and neither the UN nor states are impartial.

Harm to Innocents: Economic sanctions that are used to prevent, terminate or punish breaches of international law sometimes intentionally (or at least foreseeably) harm innocent people—those who bear no moral responsibility for the illegality in question.  This is morally problematic because, as a matter of justice, we usually think that the harms of law enforcement and punishment should be directed only at wrongdoers (Lang, 2008; Damrosch, 1994).

Here though it is worth making a distinction between punishment after the fact and law enforcement directed at preventing or terminating violations of law.

In the case of punishment after the fact, it is straightforwardly accepted by most that it is wrong to punish the innocent.  This means that collective sanctions—those aimed at the entire population of a state—are straightforwardly morally wrong if judged as punishment.  They are a type of collective punishment that punishes the innocent along with the guilty.  Targeted sanctions, of course, may be targeted directly at the guilty (or at least those believed to be guilty) and so can avoid this problem.

Lang would extend the prohibition on harming the innocent to all types of law enforcement.  However, Damrosch argues that the case of preventing and terminating violations of law is different.  She argues that if the law being enforced is important enough (for example, if the sanctions are aimed at preventing genocide) then innocents may be intentionally or foreseeably harmed to achieve this.  To be sure, law enforcement measures should be chosen carefully to minimise the suffering of innocent bystanders, but it should not be ruled out altogether (Damrosch, 1994, 67).

c. Utilitarianism

Joy Gordon has used utilitarianism to assess the moral status of comprehensive economic sanctions (Gordon, 1999). According to utilitarianism, an act is right if and only if it maximises utility (i.e. the balance of pleasure over pain or, more generally, of benefit over harm).

According to Gordon, comprehensive economic sanctions are justified on utilitarian grounds in cases where ‘the economic hardship of the civilian population of the target country entails less human harm overall, and less harm to the sanctioned population, than the military aggression or human rights violations the sanctions seek to prevent’ (Gordon, 1999, 133).  Let us consider this idea in a bit more detail.

Imagine a sender is indeed considering imposing economic sanctions on a state that is engaged in military aggression or human rights violations.  According to utilitarianism, the sender would be permitted (indeed, required) to impose economic sanctions if the sanctions were expected to result in less harm overall than any other means of ending the aggression/human rights violations (travel bans, military intervention and so forth) or, indeed, “doing nothing” and letting the aggression/violations continue unchecked.  Note that in making this utilitarian calculation, harms to sender citizens, target citizens and all other individuals affected are to be counted and weighed equally.

In order to determine whether economic sanctions are expected to result in the least harm in this case, we need to address two questions: (1) how harmful do we expect the economic sanctions to be? and  (2) what is the probability they will succeed in ending the human rights abuses?

(1) It is fair to say that, in general, economic sanctions are less harmful and destructive in their effects than military attack but more harmful and destructive than diplomatic measures (such as travel bans or withdrawing staff from embassies).  However, there will be exceptions.  For example, a targeted military strike might result in a lot less harm than collective, comprehensive sanctions.  It should not always be assumed that economic sanctions are less harmful than military action.  Senders should also take care to consider the full range of economic sanctions available to them: targeted sanctions may cause much less harm than collective sanctions but be equally effective.

(2) We also need to consider whether the economic sanctions will be successful at ending the human rights abuses.  It is important to take this into account.  If economic sanctions do not work, then the target citizens continue to suffer the human rights abuses whilst also suffering the economic sanctions.  It would have been better not to have imposed the sanctions at all.  From a utilitarian point of view, it is wrong to impose economic sanctions if it is expected that they will fail or that they are very likely to fail.  Since economic sanctions often have quite a low probability of success, they will often be ruled out on utilitarian grounds, at least in the case of the more harmful comprehensive sanctions.  Of course, this would need to be considered on a case-by-case basis.  Gordon finds the ineffectiveness of economic sanctions particularly troubling, and claims it is unlikely any particular episode of comprehensive sanctions would be justified on utilitarian grounds (Gordon, 1999, 137).
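
To see the shape of this calculation, consider a minimal sketch of the expected-harm comparison in Python (the figures are invented placeholders for illustration only; they do not come from Gordon or from any empirical study):

    # A minimal sketch of the utilitarian comparison described above.
    # All numbers are invented placeholders, not estimates from the literature.

    def expected_harm(p_success, harm_of_measure, harm_if_abuses_continue):
        """Expected total harm of a measure: the harm the measure itself inflicts,
        plus the harm of the abuses continuing, weighted by the chance it fails."""
        return harm_of_measure + (1 - p_success) * harm_if_abuses_continue

    options = {
        # option: (probability of ending the abuses, harm inflicted by the measure,
        #          harm if the abuses continue unchecked)
        "do nothing":              (0.00,  0.0, 100.0),
        "comprehensive sanctions": (0.30, 60.0, 100.0),
        "targeted sanctions":      (0.25, 10.0, 100.0),
        "military intervention":   (0.70, 80.0, 100.0),
    }

    for name, args in options.items():
        print(f"{name}: expected harm = {expected_harm(*args):.0f}")

    # With these illustrative figures, targeted sanctions minimise expected harm (85),
    # while comprehensive sanctions (130) come out worse than doing nothing (100),
    # mirroring the worry that harmful sanctions with a low chance of success fail
    # the utilitarian test.

On a utilitarian view the sender would then be required to choose whichever option minimises expected harm overall, which is why both the probability of success and the severity of the sanctions’ own harms matter to the assessment.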

Finally, senders also need to remember that economic sanctions—especially those using an economic pressure mechanism—often take years to work.  Military intervention might be a faster way of ending the human rights abuses and consequently be the action that results in the least harm overall.  In such a case, utilitarianism would demand military intervention, not economic sanctions.

d. “Clean Hands”

Conventionally, economic sanctions are conceptualised as being measures designed to change the objectionable/unlawful behaviour of targets (or perhaps to punish it).  However, Noam Zohar, drawing on Jewish theological tradition, argues in favour of an alternative way of thinking about economic sanctions—that of economic sanctions as a method of ‘preserving clean hands’.

Under a ‘clean hands’ sanctioning policy, the objective of the economic sanctions is not to change a target’s behaviour or to punish it but rather to avoid complicity in that behaviour.  Zohar argues, for example, that if one state sells weapons—or allows weapons to be sold by its citizens—to a second state where it knows or suspects those weapons will be used to commit human rights violations, then it facilitates those violations and is thus morally responsible for them as an accomplice.  Hence states have a duty to impose arms embargoes (a type of economic sanction) on targets that they suspect would use those arms to commit human rights violations.  Furthermore, clean hands sanctions are not restricted to arms embargoes; Zohar argues that embargoes would be required on all goods which would facilitate wrongdoing.  For example, he argues that there is a requirement to prevent oil exports to a state whose military is engaged in ethnic cleansing, as oil would be necessary to fuel tanks, planes and so on (Zohar, 1993).  Zohar’s analysis is restricted to cases where a state is violating the human rights of its own citizens.  However, it can easily be extended to cover cases where states are engaged in other types of wrongdoing, for example, pursuing aggressive war.

Zohar’s idea is interesting because to date the moral analysis of economic sanctions has almost exclusively assumed that economic sanctions are a prima facie wrong and that their use requires moral justification.  However, under a clean hands conception of economic sanctions the imposition of sanctions is, by contrast, a moral duty—a duty derived from the duty not to be complicit in human rights violations.  Employing the clean hands conception of economic sanctions thus shifts the burden of moral justification from those who would impose sanctions to those who would not.  The clean hands conception therefore appears to be a valuable tool for those who would impose economic sanctions in response to international wrongdoing.  However, attractive as it may be, there are some difficulties with Zohar’s view (some of which he acknowledges himself).

The first relates to Zohar’s conception of complicity in wrongdoing.  For Zohar, mere suspicion that the goods in question will be used for activities that violate human rights is sufficient to deem the exporting state complicit in the violations.  This view of complicity is controversial.  Many would argue that an accomplice to a crime must intend—or at least know—that the goods they are supplying will be used to commit a crime.  To designate a person an accomplice on the grounds of mere suspicion, they argue, would appear to make one responsible for the crimes of other people, people over whom one has no control.  If it cannot be said that the exporting state is complicit in cases of suspicion, then it cannot be said that it has a duty to sanction in these cases (at least not on the grounds that sanctioning would avoid complicity in wrongdoing).  This view of complicity would restrict Zohar’s clean hands argument to cases where the exporting state intends or knows the goods supplied will be used in human rights violations.

Second, there is the question of which goods can be said to facilitate human rights violations.  It seems obvious that weapons directly facilitate all kinds of human rights violations.  But what about other goods?  What about food for example?  Without food, no military (or any other organisation) can operate.  Does this mean that in cases where a state is engaged in human rights violations, there is a duty to sanction food exports?  The clean hands argument would seem to suggest there is.  For many, however, this conclusion would be too extreme.

Another serious problem relates to the question of dual-use goods.  These are goods which have both military and civilian uses.  To borrow Zohar’s example, oil may be used to fuel a campaign of ethnic cleansing but it may also be used to heat homes in winter.  In cases of multi-lateral sanctions, such as those imposed by the UN, a ban on oil exports could cause civilians to freeze to death (as—in theory at least—no state would sell them oil).  Should the UN sanction oil to avoid complicity in ethnic cleansing or should it continue to allow the export of oil to avoid civilians freezing to death?   Zohar tentatively suggests that in such cases there may be a duty to engage in a limited military action designed to ensure oil exports are used purely by civilians.  This would allow the exporting states to avoid complicity in the ethnic cleansing without causing civilians to freeze to death.  He suggests this role could be taken on by the United Nations.

The problem with this suggestion is twofold.  First, the limited military action suggested may simply not be possible.  The importing state may simply take the oil by force from the UN.  Second, even if limited military action were possible, a positive argument would still be required for this course of action.  The fact that it resolves the dilemma is not by itself a positive argument in favour given that other methods may also resolve the dilemma, for example, full scale military intervention, and so forth.

e. Summary

Economic sanctions raise serious moral questions that have largely been ignored by philosophers and political theorists.  The existing literature on the ethics of economic sanctions, whilst important and illuminating, barely scratches the surface of the subject.  Further research in this area is required. There is scope to consider the four frameworks outlined above in more detail and to critique their application and/or the conclusions reached under each of them.  There is also scope to develop entirely new frameworks for the moral assessment of economic sanctions.

3. References and Further Reading

a. On the Nature of Economic Sanctions

  • Andreas, Peter, ‘Criminalizing Consequences of Sanctions: Embargo Busting and its Legacy’, International Studies Quarterly, 49, 2005
  • Baldwin, David, ‘The Sanctions Debate and the Logic of Choice’, International Security, 24, 1999/2000
  • Baldwin, David and Pape, Robert ‘Evaluating Economic Sanctions’, International Security, 23, 1998
  • Baldwin, David, Economic Statecraft, (Princeton: Princeton University Press, 1985)
  • Cortright, David & Lopez, George A., Smart Sanctions: Targeting Economic Statecraft, (Lanham Md:  Rowman & Littlefield, 2002)
  • Cortright, David & Lopez, George A., The Sanctions Decade: Assessing UN Strategies in the 1990s, (London: Lynne Rienner Publishers, Inc., 2000)
  • Crawford, Neta C. & Klotz, Audie, How Sanctions Work: Lessons from South Africa (Basingstoke: MacMillan Press Ltd, 1999)
  • Doxey, Margaret, International Sanctions in Contemporary Perspective (Basingstoke: MacMillan, 1987)
  • Elliott, Kimberly Ann, ‘The Sanctions Glass: Half Full or Completely Empty?’, International Security, Vol. 23, No. 1, 1998
  • Galtung, Johan, ‘On the Effects of International Economic Sanctions: With Examples from the Case of Rhodesia’, World Politics, Vol. 19, Issue 3, 1967
  • Gordon, Joy, Invisible War, (Harvard University Press, 2010)
  • Hufbauer, Gary, Jeffrey Schott, and Kimberly Ann Elliott, Economic Sanctions Reconsidered, 3rd edition, (Washington, Peterson Institute for International Economics, 2007)
  • Pape, Robert A., ‘Why Economic Sanctions Do Not Work’, International Security, Vol. 22, No. 2, 1997
  • Pape, Robert, ‘Why Economic Sanctions Still Do Not Work’, International Security, Vol. 23, No. 1, 1998
  • Peksen, Dursun and Drury, Cooper A., ‘Coercive or Corrosive?: The Negative Impact of Economic Sanctions on Democracy’, International Interactions: Empirical and Theoretical Research in International Relations, 36, 2010
  • Peksen, Dursun and Drury, Cooper A., ‘Economic Sanctions and Political Repression: Assessing the Impact of Coercive Diplomacy on Political Freedoms’, Human Rights Review, 10, 2009
  • Wood, Reed M., ‘A Hand Upon the Throat of the Nation: Economic Sanctions and State Repression, 1976–2001’, International Studies Quarterly, 52, 2008

b. On the Ethics of Economic Sanctions

  • Amstutz, Mark, International Ethics: Concepts, Theories, and Cases in Global Politics, 4th edition, (Lanham: Rowman & Littlefield Publishers Inc), 2013, Chapter 10
  • Christiansen, Drew & Powers, Gerard, F. ‘Economic Sanctions and Just War Doctrine’, in Cortright and Lopez (eds.), Economic Sanctions: Panacea or Peacebuilding? (Oxford: Westview Press, 1995)
  • Clawson, Patrick, ‘Sanctions as Punishment, Enforcement and Prelude to Further Action’, Ethics and International Affairs, 7, 1993
  • Damrosch, Lori Fisler, ‘The Collective Enforcement of International Norms through Economic Sanctions’, Ethics and International Affairs, 8, 1994
  • Ellis, Elizabeth, ‘The Ethics of Economic Sanctions’, PhD Thesis, University of Edinburgh, Edinburgh, 2013
  • Gordon, Joy, ‘Smart Sanctions Revisited’, Ethics and International Affairs, 25, 2011
  • Gordon, Joy, ‘A Peaceful, Silent, Deadly Remedy: The Ethics of Economic Sanctions’, Ethics and International Affairs, 13, 1999
  • Lang, Anthony F., Punishment, Justice and International Relations: Ethics and Order after the Cold War, (London: Routledge, 2008), Chapter 5
  • Nossal, Kim Richard, ‘International Sanctions as International Punishment’, International Organization, Vol. 43, No. 2, 1989
  • Pierce, Albert C, ‘Just War Principles and Economic Sanctions’, Ethics and International Affairs, 10, 1996
  • Winkler, Adam, ‘Just Sanctions’, Human Rights Quarterly, 21, 1999
  • Zohar, Noam, ‘Boycott, Crime and Sin: Ethical and Tulmudic Responses to Injustice Abroad’, Ethics and International Affairs, Vol. 7, 1993

c. Other Referenced Works

  • Coates, A.J, The Ethics of War (Manchester: Manchester University Press, 1997)
  • Rodin, David, War and Self Defence, (Oxford: Oxford University Press, 2002)
  • Walzer, Michael, Just and Unjust Wars: A Moral Argument with Historical Illustrations, 4th edition (New York: Basic Books, 2006)

 

Author Information

Elizabeth Ellis
Email: E.A.Ellis@leeds.ac.uk
University of Leeds
United Kingdom

Presocratics

Presocratic philosophers are the Western thinkers preceding Socrates (c. 469-c. 399 B.C.E.) but including some thinkers who were roughly contemporary with Socrates, such as Protagoras (c. 490-c. 420 B.C.E.). The application of the term “philosophy” to the Presocratics is somewhat anachronistic; what they practiced was certainly different from how many people currently think of philosophy. The Presocratics were interested in a wide variety of topics, especially in what we now think of as natural science rather than philosophy. These early thinkers often sought naturalistic explanations and causes for physical phenomena. For example, the earliest group of Presocratics, the Milesians, each proposed some material element (water, air, or the “boundless”) as the basic stuff either forming the foundation of, or constituting, everything in the cosmos.

Such an emphasis on physical explanations marked a break with more traditional ways of thinking that indicated the gods as primary causes. The Presocratics, in most cases, did not entirely abandon theistic or religious notions, but they characteristically posed challenges to traditional ways of thinking. Xenophanes of Colophon, for example, thought that most concepts of the gods were superficial, since they often amount to mere anthropomorphizing. Heraclitus understood sets of contraries, such as day-night, winter-summer, and war-peace, to be gods (or God), while Protagoras claimed not to be able to know whether or not the gods exist. The foundation of Presocratic thought is the preference and esteem given to rational thought and argumentation over mythologizing. This movement towards rationality and argumentation would pave the way for the course of Western thought.

Table of Contents

  1. On “Presocratic” and the Sources
    1. The Sources
  2. The Milesians
    1. Thales
    2. Anaximander
    3. Anaximenes
  3. Xenophanes
  4. Pythagoras and Pythagoreanism
  5. Heraclitus
  6. Eleatic Philosophy
    1. Parmenides
      1. The Path of Being
      2. The Path of Opinion
    2. Zeno
      1. Arguments against Plurality
      2. Dichotomy
      3. Infinite Divisibility and Arguments against Motion
    3. Melissus
  7. Philosophies of Mixture
    1. Anaxagoras
    2. Empedocles
      1. Macrocosm
      2. Microcosm
  8. The Atomists
    1. Ontology
    2. Perception and Epistemology
    3. Ethics
  9. Diogenes of Apollonia
  10. The Sophists and Anonymous Sophistic Texts
    1. Protagoras
    2. Gorgias
    3. Antiphon
    4. Prodicus
    5. Anonymous Texts
  11. Conclusion
  12. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. On “Presocratic” and the Sources

Difficulties are perhaps inevitable any time we lump a group of variegated thinkers under one name. The so-called “Presocratic philosophers” were a group of different thinkers hailing from different places at different times, many of whom thought about different things. To call them all “Presocratic” thinkers can seem too sweepingly broad and inaccurate, or insensitive to the differences between each of the thinkers. Even, and perhaps especially, where there are similarities, “Presocratic” seems unsatisfactory. For, where the thought of different people deals with similar ideas, a specific name seems appropriate for that group of people. This happens in Presocratic philosophy (for example, the Milesians), but those specific names are treated merely as species of the larger genus that we call “Presocratic philosophy.”

There are also historical difficulties with the term. For example, the atomist Democritus—traditionally considered to be a Presocratic—is supposed to have been approximately contemporary with Socrates. Continuing to use the term, then, should be a tentative and careful endeavor. Whatever the case, these thinkers set Western philosophy on its path.

a. The Sources

We have no complete writings from any of the Presocratics, and from some, nothing at all. Our sources, then, are primarily twofold: fragments and testimonia. The fragments are purported bits of the thinkers’ actual words. These might be fragments of books that they wrote, or simply recorded sayings. In any case, there are no surviving complete works from the Presocratics. Moreover, it is important to remember that there are no original compositions—of any length or degree of completeness—available. Neither, for that matter, are any originals available from Plato or Aristotle. In the pre-printing press days, scribes copied whatever editions of books and other written works they had available to them. We have texts that have been copied many times over. This means that, even with the fragments, we can never be sure whether or not the words we are reading correspond exactly to the original ideas that the Presocratics expressed.

The ancient testimonies come to us from several sources, each having its own agenda and degree of reliability. Both Plato and Aristotle explicitly name many of the Presocratics, sometimes discussing their supposed ideas at length. We must recognize that both Plato and Aristotle almost certainly treated Presocratic thought in light of their own respective philosophical agendas. Therefore, the information we get from them about the Presocratics is likely skewed and sometimes arrantly false. Plato wrote philosophical-literary dialogues, and likely needed to represent the Presocratics in his own peculiar ways to meet the needs of the dialogues. Aristotle, who wrote in the treatise style to which we are more accustomed today, also references the Presocratics in the context of his own philosophy. Aristotle would set out to write on a particular topic (for example, physics), and would survey the ideas of his predecessors on that same topic. In doing so, he at times agreed with their positions, and often disagreed with them. We have to beware, especially where Aristotle disagreed with his predecessors, of a possible (and possibly intentional) straw-man technique that Aristotle might have employed to advance his own position. Thus, while the accounts of Plato and Aristotle can be useful, we should read them cautiously.

2. The Milesians

While it might be inaccurate to call them a school of thinkers, the Milesian philosophers do have connections that are not merely geographical. Hailing from Miletus in Ionia (modern day Turkey), Thales, Anaximander, and Anaximenes each broke with the poetic and mythological tradition handed down by Hesiod and Homer. With what little we know about the Milesians, we do not consider them philosophers in the same way that we consider Plato, Aristotle, and their successors philosophers. Much of what we know about them suggests that they were protoscientists, concerned with cosmogony, the generation of the cosmos, and cosmology, the study of or inquiry into the nature of the cosmos. Their cosmogonies and cosmologies are oriented primarily by naturalistic explanations, descriptions, and conjectures, rather than traditional mythology. In other words, the Milesians ostensibly sought to explain the cosmos on its own terms, rather than pointing to the gods as the causes or progenitors of all natural phenomena.

The geographical placement of Miletus is noteworthy. It is not unlikely that someone like Thales, for example, travelled to Egypt and perhaps to Babylon. Indeed, there is considerable evidence to suggest that the Babylonians, in some fashion or another, contributed significantly to ancient Greek knowledge of astronomy and mathematics. This is important to keep in mind when considering Presocratic discoveries in astronomy, mathematics, and other fields. There is scant evidence to suggest that this or that Presocratic thinker was the sole inventor or discoverer of any particular scientific finding or field.

a. Thales

Typically considered to be the first philosopher in the history of Western philosophy, Thales (c. 624-c. 545 B.C.E.) is a figure surrounded by legend and anecdotes. The historian Herodotus says that Thales proposed a single congress for Ionia, effectively centralizing the governmental powers, and making Ionia a single state (Graham 23). In a Lydian military campaign, he is supposed to have diverted the Halys river so that the Lydian military could safely cross in the absence of bridges (Graham 25). Aristotle relays another story, claiming to show us how Thales defended himself and philosophers against the claim that philosophers are useless. Through astronomy, Thales was purportedly able to predict a good olive harvest for a particular year. That winter, he bid on the region’s olive presses, and since no one bid against him (they apparently found his prediction incredible), he put down only a small sum. “When harvest time came and everyone needed the presses right away, he charged whatever he wished and made a good deal of money—thus demonstrating that it is easy for philosophers to get rich if they wish, but that is not what they care about” (Graham 25). Plato relates the humorous story that Thales fell into a well while stargazing. “A Thracian servant girl with a sense of humor…made fun of him for being so eager to find out what was in the sky that he was not aware of what was in front of him right at his feet” (Graham 25). Thus, this might be the first anecdote of the impractical and incompetent philosopher who proves himself practically competent, but ultimately unconcerned with worldly affairs.

While we have no way of knowing whether or not any of these stories square with the facts, they paint a picture of Thales as a practical and theoretical wise man—a picture that attracted the eyes of most ancient authorities. He is said to have predicted a solar eclipse in 585 B.C.E., helping the Ionians in battle, since he informed them of the coming darkness, and the enemy was, literally, left in the dark (Graham 23). It is also reported that Thales was highly influential in his work in geometry, if not being entirely responsible for introducing it to Greece from Egypt. Indeed, he is supposed to have discovered that two triangles sharing a side and having equal adjacent angles are congruent (Graham 35), that a circle is bisected by its diameter (Graham 33), and that the angles at the base of an isosceles triangle are equal (Graham 35).

Perhaps because of Thales, Milesian philosophy has running through it a taste for the first principles or beginnings of the cosmos. Thales supposed the principle or source (arche) of all things to be water. Aristotle guesses some reasons why Thales might have believed this (Graham 29). First, all things seem to derive nourishment from moisture. Next, heat seems to come from or carry with it some sort of moisture. Finally, the seeds of all things have a moist nature, and water is the source of growth for many moist and living things. Some assert that Thales held water to be a component of all things, but there is no evidence in the testimony for this interpretation. It is much more likely, rather, that Thales held water to be a primal source for all things—perhaps the sine qua non of the world.

It is unclear just how far we are to take Thales here, or precisely how, or if, water plays a role in every cosmological phenomenon. While Thales did turn to naturalistic explanations of the cosmos, he did not abandon belief in the gods. He was supposed to have thought that “all things are full of gods,” and that water is pervaded by a divine power, which also moves the water (Graham 35). If all things either are water, or can ultimately be traced in some way to water, water itself becomes divine—it is the life of the universe, and thus all things are in some way divine. Moreover, if water is more or less connected with some particular thing in the cosmos, then it would stand to reason that some things are more or less divine. As Aetius testifies, “Thales said that God is the mind of the world, and the totality is at once animate and full of deities. And a divine power pervades the elemental moisture and moves it” (Graham 35). Thales, then, did not abandon theology in favor of naturalism, but rather radically modified it.

b. Anaximander

Anaximander (c. 610-c. 545 B.C.E.) followed in Thales’ footsteps (he might have been Thales’ student) by applying his astronomical knowledge to practical life on earth. He was supposed to have invented the gnomon, a simple sundial (Graham 49). He may have introduced the knowledge of the solstices and equinoxes to the Greeks, as well as the twelve-hour division of the day—knowledge he probably gained from the Babylonians (Graham 49). He travelled extensively, gaining first-hand geographical knowledge. Indeed, he was supposed to have drawn a map of the earth as he knew it (Graham 49).

Like Thales, Anaximander also posited a source for the cosmos, which he called the boundless (apeiron). That he did not, like Thales, choose a typical element (earth, air, water, or fire) shows that his thinking had moved beyond the most readily evident sources of being. He might have thought that, since the other elements seem more or less to change into one another, there must be some source beyond all these—a kind of background upon or source from which all these changes happen. Indeed, this everlasting principle gave rise to the cosmos by generating hot and cold, each of which “separated off” from the boundless. How it is that this separation took place is unclear, but we might presume that it happened via the natural force of the boundless. The universe, though, is a continual play of elements separating and combining. In poetic fashion, Anaximander says that the boundless is the source of beings, and that into which they perish, “according to what must be: for they give recompense and pay restitution to each other for their injustice according to the ordering of time” (F1).

In the generation of the cosmos as we know it now, human beings came to be from other animals. While it would be inaccurate to call Anaximander the father of the theory of evolution, the history of that theory should at least make mention of his name. Anaximander thought that human beings could not have been at their origin the way that they are now. That is, they must have arisen from some other animals, since human beings need longer stretches of time for nurture than other animals. They could not have survived, he reasons, without the generative help of other animals (Graham 57). He thought that human beings arose from or were at least akin to fish (Graham 59). Beyond this, humans seem to have needed moisture and heat for their generation. More specifically, humans originated with moisture in some sort of shell, and eventually matured, moved onto land, “and survived in a different form for a short while” (Graham 63). What evidence Anaximander might have had to support these claims we can only guess, but his willingness to explain the world on its own terms, without recourse to divine generation or intervention (although he might well have considered the boundless to be divine), is the mark of a new way of thinking.

c. Anaximenes

If our dates are even approximately correct, Anaximenes (c. 546-c. 528/5 B.C.E.) could have had no direct philosophical contact with Anaximander. However, the conceptual link between them is undeniable. Like Anaximander, Anaximenes thought that there was something boundless that underlies all other things. Unlike Anaximander, Anaximenes made this boundless thing something definite—air. For Anaximander, hot and cold separated off from the boundless, and these generated other natural phenomena (Graham 79). For Anaximenes, air itself becomes other natural phenomena through condensation and rarefaction. Rarefied air becomes fire. When it is condensed, it becomes water, and when it is condensed further, it becomes earth and other earthy things, like stones (Graham 79). This then gives rise to all other life forms. Furthermore, air itself is divine. Both Cicero and Aetius report that, for Anaximenes, air is God (Graham 87). Air, then, changes into the basic elements, and from these we get all other natural phenomena. This means that ostensibly qualitative properties of things, for example, hot-cold, hard-soft, and so forth, are reducible to quantitative properties (McKirahan 51). Since air is boundless, it does not have a beginning or end, but is in a constant state of flux. Air is the morphological thread binding all things together.

A rather convenient psychological takeaway from Anaximenes’ theory is that the soul (psychê), traditionally considered to be breath, is itself airy (Graham 87). So, the individual human soul is in some way divine since each human being partakes of air. Again, it is remarkable that Anaximenes, like his fellow Milesians, did not have recourse to Homeric or Hesiodic mythology to explain the world. The Milesians arguably stand at the beginning, at least as the testimony and scant textual evidence has it, of a distinct way of thinking that we consider to be scientific, however primitive it may be. Despite this inclination toward naturalistic explanations of the world, they considered the gods to be thoroughly infused with their world. With the Milesians comes a radical shift in thought. The radical nature of their thinking does not depend upon a rejection of all divinity, but a reformation in the way we think about it. This leads us to Xenophanes, who first explicitly formulated a critique of traditional ways of thinking about divinity.

3. Xenophanes

Xenophanes (c. 570-c. 478 B.C.E.) was from Colophon, north of Miletus in Ionia. He did not remain in Colophon, but travelled around Greece reciting his poetry, finally settling in modern day Sicily. Since his views were expressed poetically, it is at times difficult to know how to interpret them. Thus, we should keep in mind that, while we have more fragmentary material from Xenophanes than all of the Milesians taken together, the way in which his views were expressed, and the fragmentary nature of our sources, prevents us from being certain about what exactly he meant. What exposure he might have had to Milesian thought we do not know. Like the Milesians, however, he challenged traditional theological views, but in a new way. Even his social views seem to have been at odds with the ancient Greek sensibilities. For example, he renounces the glorification and honorific status of athletes, saying that wisdom should be preferred (F2).

Unlike the Milesians (or the evidence we have of them), Xenophanes directly and explicitly challenged Homeric and Hesiodic mythology. “It is good,” says Xenophanes, “to hold the gods in high esteem,” rather than portraying them in “raging battles, which are worthless” (F2). More explicitly, “Homer and Hesiod have attributed to the gods all things that are blameworthy and disgraceful for human beings: stealing, committing adultery, deceiving each other” (F17). At the root of this poor depiction of the gods is the human tendency towards anthropomorphizing the gods. “But mortals think gods are begotten, and have the clothing, voice and body of mortals” (F19), despite the fact that God is unlike mortals in body and thought. Indeed, Xenophanes famously proclaims that if other animals (cattle, lions, and so forth) were able to draw the gods, they would depict the gods with bodies like their own (F20). Beyond this, all things come to be from earth (F27), not the gods, although it is unclear whence the earth came. The reasoning seems to be that God transcends all of our efforts to make him like us. If everyone paints different pictures of divinity, and many people do, then it is unlikely that God fits into any of those frames. So, holding “the gods in high esteem” at least entails something negative, that is, that we take care not to portray them as super humans.

We have seen what the gods are not, but what is God or the gods? It is unclear whether Xenophanes was a theological monist or a pluralist, but he seems at least to hint at either one God only, or one God above all others. “One God, greatest among gods and men…” (F23) could mean that there is one God only, despite the fact that mortals talk about a plurality of Gods, or that there is one God who is greater than all the rest. This God, in his entirety, sees, thinks, hears, and shakes all things by the thought of his mind (F24-F25). He remains, unmoving, in the same place (F26). If God is in some place, does this not mean that he is embodied? This is unclear, but Aristotle claims that Xenophanes thought of God as spherical, presumably based upon the picture of uniformity portrayed in the preceding fragments (Graham 113). We might also wonder whether or not this depiction of God, too, is in some way anthropomorphizing. How do we know that God has a mind, or that he hears, sees, and thinks? Xenophanes does not present us with answers to these questions. Whatever the case, Xenophanes’ God is unlike any previous conceptions of divinity, and seems to have set in motion a long tradition of critical and rational theology.

Ultimately, we can never know the full and simple truth about the gods or anything else. Even if we successfully describe events in our world, we cannot claim knowledge about such things; for, “opinion is wrought over all” (F35). This, however, apparently does not prevent us, through an effort of seeking, from understanding things better. If Xenophanes is a skeptic, therefore, his skepticism is pliable and open-ended. By rejecting dogmas, Xenophanes is willing to make rational conjectures about God.

4. Pythagoras and Pythagoreanism

Ancient thought was left with a strong legacy of Pythagorean influence, and yet little is known with certainty about Pythagoras of Samos (c. 570-c. 490 B.C.E.). A great deal of legend surrounds the life of Pythagoras. Scholars generally agree that Pythagoras left Samos for Croton, where he enjoyed political esteem as a ruler. His political success, however, was not his philosophical legacy, but instead the almost religious following that developed in his name (perhaps because of his political success). He developed a following that continued long past his death, on down to Philolaus of Croton (c. 470-c. 399 B.C.E.), a Pythagorean from whom we may gain some insight into Pythagoreanism. Whether or not the Pythagoreans followed a particular doctrine is up for debate, but it is clear that, with Pythagoras and the Pythagoreans, a new way of thinking was born in ancient philosophy, and had a significant impact on Platonic thought.

Many know Pythagoras for his eponymous theorem—the square of the hypotenuse of a right triangle is equal to the sum of the squares of the other two sides. Whether Pythagoras himself invented the theorem, or whether he or someone else brought it back from Egypt, is unknown. He was accorded almost godlike status among his followers, some saying that there are three classes of rational beings: the gods, human beings, and beings like Pythagoras (Graham 921). He was said to have a golden thigh, to have been hailed by name by the river Cosas, and to have been seen simultaneously in both Metapontum and Croton (Graham 919). Empedocles sang his praises by saying that Pythagoras could, by the power of his mind, behold all things “for ten or even twenty generations of men” (Graham 917).
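In modern notation, the theorem says that a right triangle with legs a and b and hypotenuse c satisfies the relation below; the familiar 3-4-5 triangle is a standard worked example (the particular numbers are a modern illustration, not drawn from the ancient sources):

```latex
a^{2} + b^{2} = c^{2}, \qquad 3^{2} + 4^{2} = 9 + 16 = 25 = 5^{2}.
```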

One doctrine that scholars confidently attribute to Pythagoras and his followers is the transmigration of souls. The soul, for Pythagoras, finds its immortality by cycling through all living beings in a 3,000-year cycle, until it returns to a human being (Graham 915). Indeed, Xenophanes tells the story of Pythagoras walking by a puppy who was being beaten. Pythagoras cried out that the beating should cease, because he recognized the soul of a friend in the puppy’s howl (Graham 919). Another Pythagorean view seems not to have restricted a life cycle to souls, but widened the scope to all things, such that there is nothing completely new, since everything has happened before and will happen again (Graham 919). What exactly the Pythagorean psychology entails for a Pythagorean lifestyle is unclear, but we pause to consider some of the typical characteristics reported of and by Pythagoreans.

Pythagoreans were famous for their silence (Graham 911). Their teachings were transmitted cryptically, and it is unclear how strict a doctrine the followers were required to observe. Some are reported to have refrained from eating or handling beans, either because they resemble genitals or the gates of Hades. Some were commanded not to sacrifice a white rooster, since white symbolized purity and goodness, and because roosters are sacred to Men, announcing as they do the sunrise in the morning (Graham 923). There were also the akousmata (things heard), which were expressed in three categories: what something is, what the most x is (for example, what is the wisest?), and what one should or should not do (for example, abstaining from beans or sacrificing white cocks). The Oracle at Delphi was said to be the tetractys and, therefore, harmony, an answer to the first type of akousma. Number is said to be the wisest, with giving names to things coming in second for wisdom (Graham 923).

Plato and Aristotle tended to associate the holiness and wisdom of number—and along with this, harmony and music—with the Pythagoreans (Graham 499). For example, the decad was sacred. The tetractys shows us the holiness of the number ten.

[Figure: the tetractys, a triangular arrangement of ten points in four rows of one, two, three, and four.]

Here, we can see a relationship among numbers, all of which together lead us to a single figure. There is the one, which begets plurality (two). When we add three and four to these, there is the sum of ten, which signifies the composition of the cosmos (Graham 499). There were nine visible heavenly bodies, and so the Pythagoreans posited a tenth body, counter-earth, to balance out the cosmos. The tetractys also gives us the ratios of harmony: 1:2, 2:3, and 3:4, or the octave, the fifth, and the fourth, respectively (McKirahan 92). The universe is harmony, and Philolaus considered the soul also to be a harmony (Graham 505). Thus, at least for Philolaus, the soul could be considered to be a type of microcosm.
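Set out as bare arithmetic (a modern restatement of the relationships just described, not a quotation from any Pythagorean source), the tetractys encodes both the sacred sum and the concordant ratios:

```latex
1 + 2 + 3 + 4 = 10, \qquad 1:2 \ (\text{octave}), \quad 2:3 \ (\text{fifth}), \quad 3:4 \ (\text{fourth}).
```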

Perhaps more basic than number, at least for Philolaus, are the concepts of the limited and unlimited. Nothing in the cosmos can be without limit (F1), including knowledge (F4). Imagine if nothing were limited, but matter were just an enormous heap or morass. Next, suppose that you are somehow able to gain a perspective of this morass (to do so, there must be some limit that gives you that perspective!). Presumably, nothing at all could be known, at least not with any degree of precision, the most careful observation notwithstanding. Additionally, all known things have number, and number is classed in two kinds: odd and even (F6). Number, too, can be seen here as a kind of limiter. Each thing is one, and thus separate from other things.

There is evidence to suggest that some Pythagoreans gave credence to a list of opposites in addition to limit-unlimited and odd-even: one-plurality, right-left, male-female, rest-motion, straight-bent, light-dark, good-evil, square-oblong. The left side of each of these binaries would be organized in one column, while the right side would be organized in a parallel column. Although it is unclear how, these columns of opposites somehow give us insight into the basic stuff of the cosmos and of being. Notice also that there are ten pairs of opposites. Limit-unlimited and odd-even are listed first, and these give rise to the rest of the cosmos (McKirahan 97). Thus, the Pythagoreans saw a universe whose nature is numerical, but also one in the tension of harmony and, as with Heraclitus, the tension of opposites.

5. Heraclitus

Just south of Colophon in Ionia was Ephesus, where yet more new philosophical blood was circulating. Heraclitus (c. 540-c. 480 B.C.E.) stands out in ancient Greek philosophy not only with respect to his ideas, but also with respect to how those ideas were expressed. His aphoristic style is rife with wordplay and conceptual ambiguities. Heraclitus was getting at what he saw as a reality composed of contraries—a reality, too, whose continual process of change is precisely what keeps it at rest. Such a unique style of thought and expression seems to have sprung forth from a life just as unique, and perhaps even contrarian. While we often do well to proceed cautiously with Diogenes Laertius’ accounts of the philosophers, his account of Heraclitus is telling, and fits with Heraclitus’ sometimes scathing thought. Diogenes Laertius calls him “conceited” and “haughty,” citing as evidence Heraclitus’ denunciation of Hesiod, Pythagoras, Xenophanes, and Hecataeus as people who have learned much (literally, polymaths), but understand little. Diogenes Laertius says that Heraclitus “studied with no one, but asserted he inquired of himself and learned everything by himself” (Graham 139). Indeed, when reading Heraclitus, one can easily imagine a loner whose originality of thought was closely linked with, if not born from, that solitude.

He is often critical of the ignorance—that is, the lack of genuine understanding—of the majority of human beings. He speaks of a logos (translatable as “word,” “reason,” “rationality,” “language,” “ratio,” and so forth) that most human beings do not understand, neither before nor after they hear it. Many people are asleep, despite being awake. “Having heard without comprehension, they are like the deaf; this saying bears witness to them: present they are absent” (F6). Pronouncing a sentiment further echoed in Plato and Aristotle, Heraclitus says, “the many are base, while the few are noble” (F12). Most people do not observe the world carefully enough, and few attain a true understanding of it. There is in Heraclitus a distinction between having much information under one’s belt, and understanding how all of it fits together, what it all means, that is, its overall significance.

One might wonder whether or not God, for Heraclitus, is synonymous with reality, so that a real understanding of the universe is an understanding of what is sacred. God is “day night, winter summer, war peace, satiety hunger…” (F103). Fire plays a significant role in his picture of the cosmos. No God or man created the cosmos, but it always was, is, and will be fire. At times it seems as though fire, for Heraclitus, is a primary element from which all things come and to which they return. At others, his comments on fire could easily be seen metaphorically. What is fire? It is at once “need and satiety.” This back and forth, or better yet, this tension and distension is characteristic of life and reality—a reality that cannot function without contraries, such as war and strife. “A road up and down is one and the same” (F38). Whether one travels up the road or down it, the road is the same road. “On those stepping into rivers staying the same other and other waters flow” (F39). In his Cratylus, Plato quotes Heraclitus, via the mouthpiece of Cratylus, as saying that “you could not step twice into the same river,” comparing this to the way everything in life is in constant flux (Graham 158). This, according to Aristotle, supposedly drove Cratylus to the extreme of never saying anything for fear that the words would attempt to freeze a reality that is always fluid, and so, Cratylus merely pointed (Graham 183). Whether or not this is a fair interpretation of Heraclitus, we can see that change plays a central role in his thought. Yet, Heraclitus recognizes that “changing it rests” (F52). So, the cosmos and all things that make it up are what they are through the tension and distention of time and becoming. The river is what it is by being what it is not. Fire, or the ever burning cosmos, is at war with itself, and yet at peace—it is constantly wanting fuel to keep burning, and yet it burns and is satisfied.

6. Eleatic Philosophy

Three important thinkers fall under the category of Eleatic thought: Parmenides, Zeno, and Melissus. The latter was not from Elea as the former two were, but his thought directly inherits the monism typical of Parmenides and Zeno. Thus, Melissus will be treated in this section after Parmenides and Zeno.

a. Parmenides

If it is true that for Heraclitus life thrives and even finds stillness in its continuous movement and change, then for Parmenides (c. 515-c. 450 B.C.E.) life is at a standstill. Hailing from Elea (a Greek colony in modern day Italy), and the father of Eleatic philosophy, Parmenides was a pivotal figure in Presocratic thought, and one of the most influential of the Presocratics in determining the course of Western philosophy. According to McKirahan, Parmenides is the inventor of metaphysics (157)—the inquiry into the nature of being or reality. While the tenets of his thought have their home in poetry, they are expressed with the force of logic. The Parmenidean logic of being thus sparked a long lineage of inquiry into the nature of being and thinking.

Parmenides’ poem moves in three parts: a sort of foreword (proemium), a section on Truth, and a section on Opinion (the way of mortals). The narrator of the poem describes allegorically a journey in a chariot, led speedily along by mares, but guided by maidens from the House of Night. He was led to the threshold of the paths of Night and Day, where Justice holds the keys that open the door to each. The maidens persuaded Justice, with gentle words, to open the door between Night and Day, whereupon the travellers were greeted by a goddess, who claims to teach the only paths for thought: “the one: that it is and that it is not possible not to be, is the path of Persuasion (for she attends on Truth); the other: that it is not and that it is right it should not be, this I declare to you is an utterly inscrutable track, for neither could you know what is not (for it cannot be accomplished), nor could you declare it” (F2). The “inscrutable” track is the path of mortals (Opinion), while the former is the path of Truth. Curiously, the goddess urges the sojourner to learn both, claiming that “it is right for you to learn all things.” The goddess suggests that, although the path of Opinion is ultimately wrong-headed, it is nevertheless wise to understand why such a path is one to which many so often cling.

i. The Path of Being

The first path is the path of being. The Greek word esti(n) is the third person singular of the verb “to be.” It need not express a subject, and does not in Parmenides’ poem. We therefore import the English word “it” into the translation for smooth English. There is much debate about the way Parmenides uses “to be” in his poem, but the possibilities are these. First, he might have used esti in an existential sense, that is, that something simply exists (for example, Spot exists). Second, he might have meant esti in the predicative sense, for example, “the t-shirt is red.” Third, esti could take a sense of identity, as in, “A=B.” Fourth is the veridical sense, or, “it is true that X.” Finally, there could be some combination of some or all of these senses of esti (Sedley 114-115 and McKirahan 160-163). Whatever the case, Parmenides does seem to have in mind the whole—all of being. As soon as we differentiate among types of beings, we have entered into the way of Opinion or plurality.

The right way of thinking is to think of what-is, and the wrong way is to think both what-is and what-is-not. The latter is wrong, and the goddess forbids it, simply because non-being is not. In other words, there is no non-being, so properly speaking, it cannot be thought—there is nothing there to think. We can think only what is and, presumably, since thinking is a type of being, “thinking and being are the same” (F3). It is only our long entrenched habits of sensation that mislead us into thinking down the wrong path. We are, as it were, “two-headed” and helpless in our ignorant journey down the path of Opinion, and we mistakenly think that being and non-being are the same.

The goddess names several characteristics of what-is. It is ungenerated and imperishable, whole and one, unperturbed, complete, completely present (without past or future), and continuous. Parmenides makes use of the Principle of Sufficient Reason to say that there is no sufficient reason for being, or what-is, to have been generated at this time or that (McKirahan 167). If at one time it came to be, that means that at one time it was not, which is impossible. It cannot not be, that is, what-is is necessary. Moreover, what-is is motionless, since motion would involve non-being, that is, changing in place or in quality requires going from what is to what is not. It is therefore the same all around and held within a limit, “which confines it round about” (F8.31). Parmenides goes so far as to compare it to a ball, maintaining balance and equal tension in all directions from the center out. It is thus complete. Is it problematic to have being bounded by a limit? Would this not mean that there is something outside being, effectively making what is outside its limits non-being? Apparently, we are to remain resolute in thinking of the sphere as complete and as all being, even though we mortals sometimes mistakenly divide it up, or conceive it as something inside a container.

ii. The Path of Opinion

Now the goddess presents the way of Opinion. She claims that her words about this way will be illusory or deceptive, meaning that the subject matter itself produces the deception. Mortals claim that there is both being and non-being. We observe the world with our senses, and put too much faith in these rather than in reason, which tells us that there is only one true way—being. Oddly, our interpretations of Parmenides become even more obscured when we reach this section. The reader is tempted to believe that Parmenides himself gave at least some degree of credence to mortal opinion. Indeed, we are told that Parmenides considered the earth and fire to be the sources of all that there is. Aristotle says that Parmenides does this in order to explain why, for reason, there is only one eternal being, while for the senses, there is a plurality of beings. Parmenides classified the hot, then, as what-is, and the cold as non-being (Graham 221).

Parmenides must in some way account for the fact that most human beings hold fast to the information that the senses provide. If most of us are in error, it is a subtle and elusive one. Since, by habit, we are so easily convinced of the truth of the senses, Parmenides attempts to explain why this is, and also attempts to give us a more intelligible account of the sensible world. The information we have does not present a clear picture of Parmenides’ vision of the cosmos, but it does give us some ideas of its nature. The hot is responsible for separation, and the cold is responsible for coalescence. Beyond this, Parmenides seems to have been a rather serious astronomer, whose astronomical theory in some important ways prefigures modern astronomy. He may have been the first Greek—the Babylonians already being privy to it—to have claimed the morning and evening star to be the same thing (Graham 225). He also claimed that the moon’s light is a reflection of the sun’s light. He may even have thought that the earth was spherical (Graham 241). Again, the earth, like being, has no reason to move this way or that, due to its equilibrium.

We see in Parmenides a reverence for reason. Even his cosmology is based upon reason rather than the senses alone. In a time before telescopes or any other sophisticated observational technology, Parmenides had to move beyond the evidence of the senses alone to determine that the morning and evening star is the same, and that the moon reflects the sun’s light. To all appearances, the moon somehow generates its own light. Parmenides, however, moved beyond appearances to explain appearances. For this very reason there is also tension in Parmenides’ thought. No matter how much faith we put in reason, and no matter how much we deny the evidence of the senses, the sensory world still convincingly thrusts itself upon us, and demands our thought, attention, and understanding. Perhaps in the end this understanding of the natural world, which to all appearances is a mixture of being and non-being, shows us a unified, eternal and simple being.

b. Zeno

Zeno (c. 490-c. 430 B.C.E.), also a native of Elea, was Parmenides’ student and possibly his beloved (erotic relationships between an older and a younger man were fairly common in ancient Greek intellectual culture). As Daniel Graham says, “Parmenides argues for monism, Zeno argues against pluralism” (Graham 245). That is, Zeno seems to have composed a text wherein he claims to show the absurdity of accepting that there is a plurality of beings. He uses arguments, often in a reductio ad absurdum form, to prove positively that there cannot be plurality, and negatively (or by an implied inference), that the only possibility is that what-is is one. Beyond this, he argued against motion and against place. Suffice it to say, Zeno’s paradoxes have since his day provided problems for philosophers and mathematicians alike. Let us examine some of Zeno’s arguments.

i. Arguments against Plurality

Many of Zeno’s arguments can be dizzying. One argument contains an important claim upon which many other arguments have their foundation. There might have been an argument for this claim, but there is none extant (Graham 267). For the sake of clarity, Graham’s summary of this initial claim (claim (a) below) and following arguments will be quoted:

(a) If there are many things, no one has size because it is one and the same as itself.

(b) If each of the many did not have size, it would not exist, for if it were added to or subtracted from something, it would make no difference to that thing.

(c) If there are many things, each must have size and solidity, and hence each must have parts with size and solidity, and similarly each of these parts must have parts.

(d) Hence, if there are many things, they must be both small and large; so small as to have no size, and so large as to be unlimited (infinite). (267)

The set of arguments (b)-(d) aims to disprove plurality. These arguments seem somehow to be based upon (a), which seems to be the conclusion of an argument for which we have no premises. At the least, we can see here, if only obscurely, Zeno’s efforts to deny pluralism.

An argument from Plato’s Parmenides goes like this. If there are many things, then each thing will be both like and unlike, and so a contradiction ensues (F1). For example, body X will be like bodies Y and Z in that all three are bodies taking up space. Yet, each of the three will be unlike the other since, let us suppose, X is red, Y is blue, and Z is green. Thus, X is both like and unlike Y and Z. If this is all there was to Zeno’s argument, as Plato presents it (perhaps simply for the dramatic purposes of the dialogue), then it is not a contradiction, since each body is like and unlike the other in different respects (McKirahan 182).

Zeno shows that if we attempt to count a plurality, we also end up with an absurdity. If there is a plurality, then there would be neither more nor less than the number that they are. Thus, there would be a finite number of things. On the other hand, if there is a plurality, then the number would be infinite, because there is always something else between existing things, and something else between those, and something else between those, ad infinitum. Thus, if there were a plurality of things, then that plurality would be both infinite and finite in number, which is absurd (F4).

ii. Dichotomy

A central argument, at least in what we have available of Zeno’s work, is what the ancients called the argument from dichotomy. There are two versions of this argument. In the first, we suppose that what-is is divisible, and then we end up with two absurdities. If it is divisible, it will be divided into an infinite number of finite parts, or it will be divided so much that nothing at all is left over. The first option is less clear. Zeno probably has in mind that an infinite number of finite parts would go to make up something that is infinitely great in size when taken as a whole (as above). The second option is clearly absurd. Therefore, being or what-is is one and indivisible (Graham 259).

iii. Infinite Divisibility and Arguments against Motion

The idea of infinite divisibility plays a key role in many of Zeno’s arguments. For example, let us look at his arguments against motion. It is impossible for a body in motion to traverse, say, a distance of twenty feet. In order to do so, the body must first arrive at the halfway point, or ten feet. But in order to arrive there, the body in motion must travel five feet. But in order to arrive there, the body must travel two and a half feet, ad infinitum. Since, then, space is infinitely divisible, but we have only a finite time to traverse it, it cannot be done. Presumably, one could not even begin a journey at all. Aristotle criticized this argument by saying that there are two senses of “infinite” with reference to magnitudes: there is infinite divisibility and infinity with reference to extremes (Graham 261). We cannot get through an infinite quantity in a finite time, but one can get through an infinitely divisible space, because time is also infinitely divisible. If there is a parallel between the divisibility of space and time, then we can cross an infinitely divisible span of space, because there will be a bit of time measuring each bit of the motion in which to do it.
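In modern notation (which, of course, was not available to Zeno or Aristotle), the halving sequence can be written as a geometric series; for the twenty-foot run, the infinitely many sub-distances sum to the finite whole, and, in the spirit of Aristotle's reply, the corresponding sub-times form a parallel series that likewise sums to a finite total:

```latex
\frac{20}{2} + \frac{20}{4} + \frac{20}{8} + \cdots \;=\; \sum_{n=1}^{\infty} \frac{20}{2^{n}} \;=\; 20 \ \text{feet}.
```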

Similar to this argument is the Achilles argument. Swift-footed Achilles will never be able to catch up with the slowest runner, assuming the runner started at some point ahead of Achilles, because Achilles must first reach the place where the slow runner began. This means that the slow runner will already be a bit beyond where he began. Once Achilles progresses to the next place, the slow runner is already beyond that point, too. Thus, motion seems absurd.
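The same convergence can be seen numerically in the Achilles case. The sketch below is purely illustrative: the speeds and head start are arbitrary assumptions, not figures found in Zeno or the testimonia, and the point it makes (that the stage times sum to a finite total) is the standard modern response rather than anything Zeno himself concedes.

```python
# Illustrative sketch of the Achilles argument. The speeds and head start
# are arbitrary assumptions, not figures from Zeno or the testimonia.
# At each stage Achilles runs to where the slower runner just was; the
# remaining gap shrinks geometrically but never reaches zero in any finite
# number of stages -- which is Zeno's point. Summing the stage times,
# however, yields a finite total.

achilles_speed = 10.0   # distance units per time unit (assumed)
tortoise_speed = 1.0    # assumed
head_start = 100.0      # assumed

gap = head_start
elapsed = 0.0
for stage in range(1, 11):
    t = gap / achilles_speed      # time for Achilles to reach the old spot
    elapsed += t
    gap = tortoise_speed * t      # new lead built up in that time
    print(f"stage {stage}: gap = {gap:.6f}, elapsed time = {elapsed:.6f}")

# The elapsed times converge to head_start / (achilles_speed - tortoise_speed),
# here 100 / 9 ≈ 11.11 time units, the moment Achilles draws level.
```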

Again, an arrow flying from point A to point B is actually not in motion. At each moment in its apparent flight, it occupies a place equal to its size. If something occupies a place equal to itself, it must be at rest, since nothing can be in a place equal to itself while in motion. Thus, the arrow is not actually in flight, but at rest in its place. Aristotle’s criticism here is that Zeno assumes time to be composed of indivisible moments or “nows.” Now the arrow is here, and now it is here, and now it is here, and so on. The other assumption of Zeno’s argument is that something is only in a place when it is at rest. He also argues against place, however, by saying that if something is in a place, then that place must be in a place, and that place must be in a place, ad infinitum. Thus, if everything is in a place, then there would be infinite places of those places, and this is absurd (Graham 261).

The most conceptually difficult argument is the Stadium or Moving Rows paradox. Suppose there is a set of bodies at one end of a racetrack and another set at the other end. They will both move in opposite directions at equal speeds and will thereby run past one another. They will both pass by a third set of stationary bodies equal in size to the racing bodies. The Stadium paradox is often illustrated with three rows of bodies: a stationary row of As, a row of Bs moving in one direction, and a row of Cs moving in the opposite direction.

The Bs and Cs are in motion, while the As are stationary. The Bs and Cs are moving at an equal and constant rate of speed. Since their starting point is the middle A, so to speak, it should take the Bs and Cs twice as long to bypass each other as it takes them to bypass the As. That is, the rightmost B must move past only one A, while it must move past two Cs, and the leftmost C must move past two Bs, but only one A. The Cs and Bs have therefore moved across both a longer and a shorter distance at the same time; thus the contradiction (Graham 263). Aristotle, however, says that this reasoning is fallacious since the Bs and Cs are in motion. Since they are in motion, and moving at an equal speed, it will take them half as long to move past each other as it does to pass a stationary A (Graham 263). Some commentators, thinking that Zeno could not possibly have made such an egregious error, suppose that Zeno might have intended for each body in the row to be atomic, that is, indivisible. If this were the case, then a B could not move past only half of an A or a C (since they are indivisible), but must move past the whole body at once. Thus, Zeno’s paradox would remain intact, although we have no textual evidence that this is what Zeno had in mind (McKirahan 192).
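Aristotle's point can be put in terms of relative speeds (a modern gloss on the reply reported above, not Aristotle's own notation): if each body has length ℓ and each moving row travels at speed v relative to the stationary As, then the Bs and Cs approach one another at 2v, so passing an oncoming body takes half the time that passing a stationary one does:

```latex
t_{\text{past a stationary A}} = \frac{\ell}{v}, \qquad t_{\text{past an oncoming body}} = \frac{\ell}{2v} = \tfrac{1}{2}\,\frac{\ell}{v}.
```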

The final paradox is the millet seed paradox, which is either given to us in an incomplete way, or is simply fallacious. If a bushel of millet seeds is dropped, it will make a sound. If this is true, then one millet seed when dropped should also make a sound, and one ten thousandth of a part should as well. But this does not happen. As it is, there are two problems with this argument. On the surface, we do not know what Zeno meant to prove from this. Logically, the argument commits the fallacy of division. Just because the whole (the bushel) makes a sound when dropped, we cannot conclude that any given part (one ten thousandth of a seed) will as well (Graham 265). Whatever the case, the overall picture of Zeno is of his fight against plurality and motion for the sake of monism.

c. Melissus

We know little about Melissus’ life except that he was a Samian admiral who commanded a fleet in battle against the Athenians (c. 441 B.C.E.). Philosophically, he clearly defends Parmenidean monism, although he does differ from Parmenides on at least two counts: the temporality of what-is, and whether what-is is unlimited or limited. He also differs from Zeno by laying out a clear thesis defending the unity of being.

Melissus sets out a system of concomitant and sequential arguments. First, what-is, or being, cannot have come from nothing. Nor could being have come to be from what-is, because this would mean that being already was. Likewise, and perhaps inversely to the first principle, being cannot become non-being. It therefore cannot perish. So, being, or what-is, is everlasting. Next, since it is everlasting—it does not come to be or perish—it has no limits set upon it, and so it is unlimited (apeiron). From this, we can see that being is one. If it were two or more, then each would be limited by the other. This leads us to see that what-is must be the same as itself, and therefore cannot be subject to the throes and flux of rearranging, pain, or any other sort of passion. Closely related to this, what-is must be motionless, since motion is a type of change. Similarly, there is no void, since the void would be nothing. This is another reason why what-is cannot move. To move, there must be emptiness or void, but since void cannot exist, we are left with fullness, that is, being is a plenum (Graham 467).

What is Melissus’ answer to the objection that we clearly observe with our senses flux and change in the world? He claims that there is only one thing that follows from his thesis. If there really is earth, fire, different types of metals, and so forth, then they must be like the one or what-is—they must each be as we first perceive them to be, for example, this here is fire, and that there is earth, and nothing else. However, when we think we see something hot becoming cold, then we simply have not observed correctly. “For they would not change if they were real, but they would remain just such as each appeared to be” (F8). Melissus does not explain what it is about our observation that goes awry. How is it that we make mistakes like thinking that we have observed a metal corroding? Melissus has no satisfactory answer to this question. If, he says, we observe correctly—if what we observe is real—it cannot change. He wants to hang on to an idea of reality where the elements, at least, remain. If we see fire, then there is always fire, despite this particular blaze burning out. Although this or that fire may be extinguished, fire is not extinguished.

7. Philosophies of Mixture

Anaxagoras and Empedocles are alike in at least two ways: first, they adhere to the Eleatic principle that being is necessary, that is, it is impossible for being not to be; second, and related to this Eleatic principle, being cannot be generated, nor can it perish, and thus all being is a continual process of mixture and separation.

a. Anaxagoras

Anaxagoras of Clazomenae (c. 500-c. 428 B.C.E.) had what was, up until that time, perhaps the most original perspective on the nature of matter and the causes of its generation and corruption. Closely predating Plato (Anaxagoras died around the time that Plato was born), Anaxagoras left his impression upon Plato and Aristotle, although they were both ultimately dissatisfied with his cosmology (Graham 309-313). He seems to have been almost exclusively concerned with cosmology and the true nature of all that is around us. In fact, some ancient authorities have even called him an atheist (Graham 305). This might be due to his purely naturalistic explanations of the world. He thought, for instance, that the sun, moon, and other heavenly bodies were fiery stones rather than divinities (Graham 297). He is also thought to have explained—more or less correctly—the phenomenon of hail (Graham 303). As we shall see, Anaxagoras called upon his senses to do their work, but also his mind to look beyond what could be seen into the causes for all things.

Before the cosmos was as it is now, it was nothing but a great mixture—everything was in everything. The mixture was so thoroughgoing that no part of it was recognizable, due to the smallness of each thing, and not even any colors were perceptible. He considered matter to be infinitely divisible. That is, because it is impossible for being not to be, there is never a smallest part, but there is always a smaller. If the parts of the great mixture were not infinitely divisible, then we would be left with a smallest part. Since the smallest part could not become smaller, any attempt at dividing it again would presumably obliterate it. The infinitely divisible parts seem to have at least been mixtures of elemental or basic stuffs—earth, wet and dry, hot and cold, and “seeds” (sperma). The nature of these seeds is unclear. They might have been simply the germ of generation or small bits of elemental things. At any rate, these seeds and all other things were mixed together prior to separation (F1-F5).

The separation of the thoroughgoing mixture was generated by a high-speed centrifugal spin (F7). It was the force and speed of the spinning that caused the separating off of each being from the other. However, this separation was not a complete purification or isolation of parts. In fact, beings in the world as we know it, says Anaxagoras, are still mixtures (F8). Everything is still in everything. The difference is that the separating force generated recognizable and individuated beings. So why, then, does gold appear to us as gold and not, say, bone, since everything is in everything? A gold coin is considered to be gold because it has more gold than anything else. The predominant bits, in other words, make up the being as we know it (McKirahan 213). The question of how something small, like a gold coin, could ever hold bits of everything in it goes unanswered in our existing information of Anaxagoras.

The processes of mixture and separation are unceasing. Generation, says Anaxagoras, is mixing, and what appears to be perishing is really separation (F11). This has profound implications for what we consider to be human mortality. Under Anaxagoras’ cooperating principles of mixture and separation, what appears to be change into non-being (death) is impossible. We might surmise that what we call death is nothing more than a separation of these parts (this particular human body) and a mixture back into those parts (the earth). Likewise, a birth cannot be a creation out of nothing. The birth began as a mixture of seeds, which themselves were presumably already mixtures of other things. What comes to be cannot come from what is not. So, generation relies upon what already is. The Anaxagorean world, then, is a continuous play of being. Like the Eleatics, Anaxagoras relies upon the idea that what-is cannot possibly not be, that is, being is necessary. Also like the Eleatics, the senses, for Anaxagoras, do not give us an exhaustively accurate picture of reality—we must rely upon reason to make sense of the world. The difference between Anaxagoras and the Eleatics, however, is that Anaxagoras allows for change and natural processes to take place, without reducing these processes to sensory illusions.

There is one important player in this continuous play of being yet to be mentioned: mind (nous). Although mind can be in some things, nothing else can be in it—mind is unmixed. We recall that, for Anaxagoras, everything is mixed with everything. There is some portion of everything in anything that we identify. Thus, if anything at all were mixed with mind, then everything would be mixed with mind. This mixture would obstruct mind’s ability to rule all else. Mind is in control, and is responsible for having started the spinning of the great mixture, such that individual beings were generated in the process of separation. Everlasting mind—the most pure and fine of all things—is responsible for ordering the world. Thus, Anaxagoras’ world is not a chaotic process of mixture and separation; rather, the processes of mixture and separation are ordered by mind, which is unmixed.

Anaxagoras left his mark on the thought of both Plato and Aristotle, whose critiques of Anaxagoras are similar. In Plato’s Phaedo, Socrates recounts in brief his intellectual history, citing his excitement over his discovery of Anaxagoras’ thought. He was most excited about mind as an ultimate cause of all. Yet, Socrates complains, Anaxagoras made very little use of mind to explain what was best for each of the heavenly bodies in their motions, or the good of anything else. That is, Socrates seems to have wanted some explanation as to why it is good for all things to be as they are (Graham 309-311). Aristotle, too, complains that Anaxagoras makes only minimal use of his principle of mind. It becomes, as it were, a deus ex machina, that is, whenever Anaxagoras was unable to give any other explanation for the cause of a given event, he fell back upon mind (Graham 311-313). It is possible, as always, that both Plato and Aristotle resort here to a straw man of sorts in order to advance their own positions. Indeed, we have seen that Anaxagoras’ principle of mind set the great mixture into motion, and then ordered the cosmos as we know it. This is no insignificant feat.

b. Empedocles

We have an extensive poem from Empedocles of Acragas (in Sicily). He lived from approximately 495 to 435 B.C.E., overlapping with Anaxagoras and Socrates. Much legend surrounds his life, and it is of course difficult to distinguish fact from fiction. He was a philosophical adherent of a Parmenidean principle of being, that is, what-is cannot not be (Graham 333). Politically, he was an advocate for democracy (Graham 335). Religiously, he seems to have been a Pythagorean, advocating a particular diet (F146-147) and endorsing the doctrine of the transmigration of souls (F124). He was reportedly a physician with a penchant for magic and prophecy. He was supposed to have kept alive a woman who neither breathed nor ate for thirty days (Graham 333). He was reportedly a self-proclaimed god, wearing purple robes, bronze shoes, and a gold wreath. To show his divinity, we are told that he leapt into the volcano at Mount Etna, purifying himself of his body (Graham 337). Legend notwithstanding, we have a substantial amount of his poetry, even if it is at times cryptic.

i. Macrocosm

At its most basic, the cosmos consists of a total of four elements or “roots,” plus two forces that are responsible for combining and separating these elements (F9). Empedocles was the first to name the four elements as earth, air, fire, and water. Love is the force that brings these elements, and the things generated from them, together, while Strife rends them (Graham 347). Empedocles, in Fragment 20 for example, repeatedly refers to a ceaseless cycle of unity from plurality (a movement of Love), and plurality from unity (a movement of Strife). While the names Empedocles uses for these forces might seem to us to carry moral overtones (Love as good and Strife as bad), they appear to be morally neutral for Empedocles—Love and Strife are simply the natural forces that guide the ceaseless motion of being.

Reminiscent of Anaxagoras’ mixture, Love holds all things together in perfect unity when it reigns supreme. As Strife begins to hold sway, the unity is pulled apart, presumably producing the sorts of singular beings we see all around us now. Empedocles makes clear, however, that these cycles are not cycles of production out of nothing or perishing into nothing (F11). What-is, or being, never ceases to be, and something cannot come from nothing, nor can anything utterly perish into nothing. Human beings are simply mistaken when we claim that this is how the world works. Empedocles claims to employ the language of birth and death only as a matter of convention, recognizing that the truth is always at hand (F12). Love and Strife are not only responsible for the unification and pluralization of the elements and all things, but they are at play in the world as we know it now. Through an everlasting play of alteration, some things are repelled by one another through Strife, and others are brought together through Love. Some things are fitted for blending, and others are prone to separation (F23). Empedocles likens this to painters mixing colors, some more and some less, in order to create a painting (F24). Likewise, Love and Strife (the painters) bring together and pull apart the primeval elements. Everything that was, is, and will be owes its being to the play of Love and Strife.

How the process of mixture and separation happens is unclear. Empedocles tells us that there is a vortex. When Love is at the middle of the vortex, all things are unified—all things come from their respective places to join together in Love. All the while, Strife is retreating to the outside of the vortex. When Strife gains the strength to do its work, then there is the separation of the elements. First air (or aether) was separated off, then fire, then earth, and then water gushed forth from earth as a result of the pressure of the heavenly rotations. When everything is in complete separation, nothing of our world is recognizable. We are, presumably, now living in a world wherein Love and Strife are both at work, with neither one dominating (McKirahan 269-270).

ii. Microcosm

Despite the predominance of his macrocosmic metaphysics in the surviving works and fragments, Empedocles did reason on the microcosmic and physical level. Different kinds of flesh seem to have been generated from different blends of the four elements (Graham 381). For human beings, perception and intelligence are keener in those whose elements are mixed more equally. Perception and intelligence, in fact, seem proportionate to one another—the more perceptive a being, the more intelligent the being (Graham 403). Moreover, thought seems to be a function of blood circulation, and Empedocles identifies the area around the heart as the area for thought (F115). There may also be a connection here with his theory of respiration. Inhalation occurs when blood retreats from tubes (presumably in the nose) and air fills those tubes. Blood rushing back into the tubes forces the air out (F78). Perception itself seems to occur when certain “effluences” from the perceived thing flow through the medium (air or water, for example) and into the pores of the sense organs. One sense cannot sense the object of another sense because the size and nature of the pores will not allow it. For example, the eyes seem to contain light or fire, and let in a certain amount of light. The ears, however, receive sound when the air outside moves and strikes the inner ear causing an echo (Graham 401).

Sometimes Empedocles describes himself as a fugitive from the gods (F8), and sometimes as himself a god who speaks the truth: “When I come to other flourishing cities I am revered by them, men and women alike” (F120). And again, “But why do I urge these things, as if doing some great deed, if I am superior to mortal men who perish many times?” (F121). At times, it seems he is a fallen god, and that humankind is for the most part fallen from divinity (F8). He seems to have advocated some type of transmigration of souls (F124), with reincarnation being based upon the purity of one’s life (Graham 415). Whether or not dietary and spiritual purity will result in salvation or re-divination is unclear. It does seem, however, that physical and spiritual purity, and intellectual prowess brings us closest to divinity. After a “fast from wickedness” (F150), “they become prophets, singers of hymns, physicians, and leaders among men on earth; afterwards they blossom as gods foremost in honors” (F153). Through it all, Love and Strife dominate and dictate the cycles of being.

8. The Atomists

Ancient atomism began a legacy in philosophical and scientific thought, a legacy that was revived and significantly developed in modern philosophy. In contemporary physics, the atom is no longer the smallest particle. Etymologically, however, atomos is that which is uncut or indivisible. The ancient atomists, Leucippus and Democritus (c. 5th century B.C.E.), were concerned with the smallest particles in nature that make up reality—particles that are both indivisible and invisible. They were to some degree responding to Parmenides and Zeno by positing atoms as indivisible bodies in motion, whereas Parmenides and Zeno considered the world to be indivisible and motionless. Since we have very little from Democritus’ teacher, Leucippus, the focus here will be on Democritus’ thought.

a. Ontology

Despite the fact that Democritus was supposed to have been a prolific writer (we are told that he wrote approximately seventy books), we now have very little of his writing on atomism (Graham 521-525). What seems clear, however, is that Democritus thought that reality is made up of the full and the empty (void). The full is what-is, and the void is what-is-not (F4). Curiously, however, Democritus said that what-is is no more than what-is-not, that is, they have the same ontological status—each is as real as the other. We might interpret this, along with Aristotle, as meaning that there is body (the full), and there is void, and neither has any higher degree of being than the other. That the void is, is as much an ontological fact as the being of the plenum (Graham 525).

Atoms—the most compact and the only indivisible bodies in nature—are infinite in number, and they constantly move through an infinite void. In fact, motion would be impossible, says Democritus, without the void. If there were no void, the atoms would have nothing through which to move. Atoms take on a variety, perhaps an infinite variety, of shapes. Some are round, others are hooked, and yet others are jagged. They often collide with one another, and often bounce off of one another. Sometimes, though, the shapes of the colliding atoms are amenable to one another, and they come together to form the matter that we identify as the sensible world (F5). This combination, too, would be impossible without the void. Atoms need a background (emptiness) out of which they are able to combine (Graham 531). Atoms then stay together until some larger environmental force breaks them apart, at which point they resume their constant motion (F5). Why certain atoms come together to form a world seems up to chance, and yet many worlds have been, are, and will be formed by atomic collision and coalescence (Graham 551). Once a world is formed, however, all things happen by necessity—the causal laws of nature dictate the course of the natural world (Graham 551-553).

Figure, order, and position (or orientation) serve as the basic marks of distinction among atoms and the things that are (F4). Leucippus and Democritus seem to have identified these distinguishing marks as contour, contact, and turning (or rotation), respectively. These three determine which atoms combine to form elemental bodies like fire and water. It is important to note, however, that atoms themselves are immutable. The sensible world is generated from their combination, and things perish when some force causes the dispersal of the atoms.

b. Perception and Epistemology

Atoms are also responsible for sense perception and thought. Atoms of particular shapes are responsible for particular tastes: round atoms, for example, are responsible for sweet tastes, while sour flavors consist of rough and angled atoms (Graham 581). Touch works similarly. Sight, hearing, and smell, however, are in some sense reducible to touch. Sensed objects always have effluences (Graham 585). We can see a tree, for example, because the tree’s atomic form somehow flows from it and makes contact with the atoms making up the eye, and the image of the tree is therefore carried into the eye. This might raise the problem of how effluences from large objects (for example, buildings) can fit into an object as small as the eye, but it could be that the effluences are somehow condensed before entering the eye (McKirahan 332). Democritus’ view of perception has important consequences for his epistemology.

If what we perceive are effluences of things, we do not perceive the things themselves; thus, we cannot know things as they are in themselves, but only as they appear to us (Graham 624). The truth is that there are atoms and void; all else is opinion and convention. It was said above that certain types of atoms are responsible for certain types of tastes, but even here convention and relativity have the final word. When certain atoms from certain objects come into contact with the atoms of different perceivers, what is sweet to one person might taste bitter to another. “By convention bitter, by convention hot, by convention cold, by convention color, but in reality atoms and void” (F32a). More precisely, we thoroughly understand very little, “but we perceive what changes in relation to the disposition of the body as things enter or resist” (F33). Even the human soul is a certain configuration and balance of atoms, and the best we can do is think, even if we cannot know much. In this way, Democritus is seen to be influential for Skepticism (Graham 516), but he is not a thoroughgoing skeptic since he claims that atoms and void can be known.

c. Ethics

While we have scant direct access to Democritus’ physical theory, we have an abundance of his own words regarding ethics. Most of his ethical thought comes to us in pithy aphorisms, with a central theme of contentment and freedom from disturbance. Well-being is founded upon contentment and being undisturbed, and these are attained by doing what is truly beneficial for oneself (Graham 633-635). The measure of what is beneficial is pleasure and pain, or joy and sorrow (F150b). It is clear, however, that Democritus does not condone sensual hedonism. In other words, there seems to be a loftier standard for what counts as pleasurable or joyful. “Those who get pleasure from the belly, when they exceed what is appropriate in food, drink, or sex, all find their pleasures are brief and short-lived, lasting only as long as they are eating or drinking, and their pains many” (F149). Constantly and excessively seeking pleasure in the flesh leads only to pain. By contrast, “reason is accustomed to take joys from itself” (F154). So, it is intellectual pleasure that is truly beneficial, and is the best measure of the best sort of life.

Reminiscent of Heraclitus, Democritus says that the best sort of person sees greater value in thinking than in polymathy (F203), and greater value in good action than in words about goodness (F267-F268). Fools leave things to chance (F105), while the wise person thinks, learns, and plans according to intelligence (F93). Interestingly, there is here a juncture of Democritus’ physical thought and his ethics. If the soul is a configuration of atoms, then teaching, learning, thought, and wisdom can help to refigure the soul and free us from the tyranny of chance (Vlastos 55-57). Pleasure and pain figure significantly into Democritean ethics, but it is pleasure of a higher sort that is constitutive of a good life. Reining in one’s desires is not sufficient for the best sort of life. “Goodness is not just avoiding wrongdoing but avoiding even the desire for it” (F83). Seeking sensual pleasures leads to a disordered and painful life, while seeking the pleasures of wisdom and understanding furnishes us with a harmonious and cheerful life.

9. Diogenes of Apollonia

Scholars know little about Diogenes’ life. He might have been active in the middle or late fifth century (McKirahan 346). We do know, however, that he resurrected material monism. Like Anaximenes, he posited air as the primary element. Unlike the records and fragments that we have of Anaximenes, Diogenes makes explicit the reason why there must be an essential and common element. “My view, in general, is that all existing things are altered from the same thing and are the same thing” (F2). Evidently, based upon the purported introduction to his text—assuming that what was just quoted immediately succeeds the introduction—Diogenes takes this to be an indisputable starting point (F1). If everything in the cosmos were different, having no nature in common, then nothing would be able to mix with anything else; for example, no plant would be able to grow from the earth. Thus, apparent difference in being is only a variation on the same type of being. The whole cosmos is a constant alteration of one being.

Why must this common or basic being be air? Animals, including human beings, cannot live without respiration, that is, air is essential for life. Following a traditional view, Diogenes considers air to be the soul or life of animals. When respiration ceases, life (the soul) leaves the body (F6). Soul, life, and air are treated synonymously in this context (F10). Moreover, air is also responsible for intelligence (F5). Again, when one ceases to breathe, one is no longer intelligent. As intelligence, air “steers and controls all things;” therefore, air seems to be “God, and to reach everywhere, to arrange all things, and to be present in everything” (F5). Everything partakes of air, but nothing partakes of air in quite the same way, “but air itself and intelligence have many forms” (F5). Sometimes air is “warmer or colder, drier or moister, and more stationary or more lively in motion, and many other differentiations are present in it, including countless differentiations of flavor and color” (F5). The differentiations of air range from the most obvious, to those so subtle we can scarcely imagine.

Diogenes tells us that no two differentiated things can become exactly like one another without becoming the same. “Nothing…of those things that are differentiated one from another is able to become exactly like the other without becoming the same” (F5). In other words, no two things can be identical and simultaneously be distinct from one another. If two or more things are identical, then they are not distinct, but the same thing, and we have no way of distinguishing between them. There are many differences among beings in the cosmos; yet, the underlying nature remains the same. This allows for varying degrees of life and intelligence among beings. Therefore, there is no reason to lump Diogenes in with the traditional and shortsighted view that only human beings have intelligence. Other beings might have intelligence as well, but to varying degrees. Air allows for the eternal being of the cosmos, the differentiation and intelligence of all things.

10. The Sophists and Anonymous Sophistic Texts

As with the terms “cynic” and “stoic,” our modern usage of “sophistry” comes to us from a school of thought that flourished in Greece in approximately the fifth century B.C.E. Again, as with “cynic” and “stoic,” the current connotations of “sophistry” are not without their roots in the historical group of thinkers called Sophists. Yet, as we cannot reduce the thought of the Cynics and Stoics to mere cynicism or apathy, we cannot reduce the thought of the Sophists to mere sophistry. As we have seen, it has been tempting to read the Presocratics through the lens of Plato’s and Aristotle’s thought, and this is no less the case with the Sophists. In fact, two of Plato’s dialogues are named after Sophists, Protagoras and Gorgias, and one is called simply The Sophist. Beyond this, typical themes of Sophistic thought often make their way into Plato’s work, not the least of which are the similarities between Socrates and the Sophists (an issue explicitly addressed in the Apology and elsewhere). Thus, the Sophists had no small influence on fifth century Greece and Greek thought.

Broadly, the Sophists were a group of itinerant teachers who charged fees to teach on a variety of subjects, with rhetoric as the preeminent subject in their curriculum. A common characteristic among many, but perhaps not all, Sophists seems to have been an emphasis upon arguing for both opposing sides of a case. Thus, these argumentative and rhetorical skills could be useful in law courts and political contexts. However, these sorts of skills also tended to earn many Sophists their reputation as moral and epistemological relativists, which for some was tantamount to intellectual fraud.

a. Protagoras

One of the earliest and most famous Sophists was Protagoras (c. 490-c. 420 B.C.E.). Only a handful of fragments of his thought exist, and the bulk of the remaining information about him, found in Plato’s dialogues, should be read cautiously. He is most famous for the apparently relativistic statement that human beings are “the measure of all things, of things that are that they are, of things that are not that they are not” (F1b). Plato, at least for the purposes of the Protagoras, reads individual relativism out of this statement. For example, if the pool of water feels cold to Henry, then it is in fact cold for Henry, while it might appear warm, and therefore be warm for Jennifer. This example portrays perceptual relativism, but the same could go for ethics as well, that is, if X seems good to Henry, then X is good for him, but it might be bad in Jennifer’s judgment. The problem with this view, however, is that if all things are relative to the observer/judge, then the idea that all things are relative is itself relative to the person who asserts it. The idea of communication is then rendered incoherent.

On the other hand, Protagoras’ statement could be interpreted as species-relative. That is, the question of whether and how things are, and whether and how things are not, is a question that has meaning (ostensibly) only for human beings. Thus, all knowledge is relative to us as human beings, and therefore limited by our being and our capabilities. This reading seems to square with the other of Protagoras’ most famous statements: “Concerning the gods, I cannot ascertain whether they exist or whether they do not, or what form they have; for there are many obstacles to knowing, including the obscurity of the question and the brevity of human life” (F3). It is implied here that knowledge is possible, but that it is difficult to attain, and that it is impossible to attain when the question is whether or not the gods exist. We can also see here that human finitude is a limit not only upon human life but also upon knowledge. Thus, if there is knowledge, it is for human beings, but it is obscure and fragile.

b. Gorgias

Not far behind Protagoras was Gorgias (c. 485-c. 380 B.C.E.). Perhaps flashier than Protagoras when it came to rhetoric and speech making, Gorgias is known for his sophisticated and poetic style. He is known also for extemporaneous speeches, taking audience suggestions for possible topics upon which he would speak at length. His most well-known work is On Nature, Or On What-Is-Not, wherein he, contrary to Eleatic philosophy, sets out to show that neither being nor non-being is, and that even if there were anything, it could be neither known nor spoken. It is unclear whether this work was in jest or in earnest. If it was in jest, then it was likely an exercise in argumentation as much as it was a gibe at the Eleatics. If it was in earnest, then Gorgias could be seen as an advocate for extreme skepticism, relativism, or perhaps even nihilism (Graham 725).

On Nature can be summarized as follows. If there is anything, it is either exclusively what-is or what-is-not, or both what-is and what-is-not are. Gorgias then eliminates each of these possibilities, beginning with what-is-not (non-being). If non-being were, then there would be a contradiction—it would simultaneously both be and not be. Moreover, if non-being were, then being (what-is) would not be, but then non-being would have the property of being, and being would have the property of non-being, which is absurd. Neither, however, is there being (what-is). If being were, it would have to be everlasting, generated, or both. If it were everlasting, it must have always been, and thus would be unlimited. But if it were unlimited, it would not exist anywhere, since for anything to be, it must be in some place, and this place must be different from that which is in it. Being cannot be generated, because if it were, it would have come to be from something that is (being), or from something that is not (non-being). If the former were the case, then being already was and did not need to be generated. If the latter were the case, then non-being would have caused being, which would be absurd. Finally, being both everlasting and generated would be a contradiction, since if it were everlasting, it could not have been generated, and vice versa (Graham 741-743).

Moreover, even if there were anything, then it could not be thought or known. In order for being to be the object of thought, then being must be, because if there is no being to be thought, it cannot be thought (Graham 743). Yet, if objects of thought were the same as what-is, then whatever we happen to think (unicorns, centaurs, and so forth) would be, but this is absurd (Graham 745). In addition, if objects of thought were things that are, then we would not be able to think of anything that is not, but since we can think of things that are not (unicorns, centaurs, and so forth), objects of thought cannot be tantamount to things that are.

Finally, even if we could think what-is, we would not be able to communicate it. We perceive objects that are different from us, for example, a table, a song, or a scent. We perceive these things by the respective senses, that is, sight, sound, and smell. We communicate by speech, but speech is not the same thing as what is perceived. “That by which we communicate is speech, but speech is not the subsisting and existing things themselves” (Graham 745). Thus, when we talk about the table, the song, or a particular scent, we do not communicate those very things to each other, but rather we communicate words. Just as, therefore, a sight cannot become a sound and vice versa, a perceived thing cannot become speech and vice versa. Again, whether this was all a mere jocular exercise in argumentation or an earnest stab at truth is unknown. If, however, it was the latter, then we seem to be left speechless in a world that is impossible to understand.

c. Antiphon

Very little is known about Antiphon the Sophist. He seems to have been known for courtroom speeches, dream interpretation, and claiming to heal depression (Graham 789). His views on justice and law are perhaps most salient in the extant fragments. Justice amounts to obeying the laws of the city in which one is a resident, but doing so only when others are present to witness it. When alone, it is better to value “the works of nature. The works of law are factitious, whereas those of nature are necessary” (F46a). The debate between law/custom (nomos) and nature (phusis) was a central theme of philosophical and sophistic thought in ancient Greece. To what degree is law natural? Is morality simply law and custom, or is it natural? Antiphon set law in opposition to nature, although it is unclear what he means by “the works of nature.” Antiphon could be interpreted as an advocate for hedonism. Indeed, things that bring pleasure, he claims, are truly advantageous and beneficial, thus following the course of nature. Things that bring pain, on the other hand, are not advantageous (F46a).

If we do read Antiphon as a hedonist, it would have to be a tempered hedonism that distinguishes between good and bad pleasures. He belittles the pleasures of sexual intercourse, claiming that such pleasures “do not travel alone, but in the company of sorrows and pains” (F51). He also looks with a critical eye towards money making, warning against miserliness. He recounts the story of a man whose hidden store of money was stolen. “His friend told him not to worry, but to put a stone in the same place where the money had been and imagine that he still had the money and he had not lost it. ‘For even when you had it, you did not use it at all; hence, do not feel deprived of it even now’” (F57). The lesson here seems to be that if one is going to make money, then one should use that money, for money stored away becomes superfluous. He also warns against doing evil to one’s neighbor, since this will necessarily incur evil for the perpetrator (F61). Moreover, “nothing is worse for men than a lack of discipline,” so we should raise our children well, and when they grow up, great changes will not overwhelm them (F64). So if we are to read Antiphon as a hedonist, then it is a hedonism that works towards what is truly advantageous for oneself—a hedonism tempered by practical wisdom.

d. Prodicus

Prodicus of Ceos (c. 465-c. 395 B.C.E.), like most Sophists, worked as a teacher and rhetorician. Like Protagoras, he presented a challenge to theistic thinking, but took this challenge further. On his account, the Greeks and Egyptians tended to consider all beneficial things to be gods. “Sun, moon, rivers, springs, and in general everything that benefits our life the ancients considered gods on account of the benefit accruing from them, just as the Egyptians make the Nile a god…” (F3c). This, of course, is not enough evidence to suggest that Prodicus was an atheist (although that word was broader for the ancients than for us, referring to those who hold no belief in gods, and to those who hold unorthodox beliefs in the gods), but it certainly represents a challenge to common theistic notions that the gods are independent of our judgments about them.

Plato portrays Prodicus as a specialist in correct diction. In the Cratylus, Socrates says,

The study of words is not a minor undertaking. If I had heard Prodicus’ fifty-drachma lecture, which provides the student complete instruction on this subject, as he himself advertises, nothing would keep me from telling you straightaway the whole truth about correct diction. But alas I have not heard it, but only the one-drachma lecture. (Graham 847)

This humorous passage is typical of Plato’s emphasis on the Sophists’ method of charging large sums of money for instruction. In fact, in the Hippias Major Plato says of Prodicus that “it is amazing how much money he took in by putting on demonstrations and instructing the young men” (Graham 843). As Graham points out, however, “The ability to make fine discriminations of words is important to rhetoric, and we should remind ourselves that there were no dictionaries in the classical age, and treatises such as Prodicus wrote were the first essays in lexicography and diction” (860). Thus, while Plato treats Prodicus with more respect than other Sophists, we should be aware that his agenda is in part to contrast Prodicus with Socrates, who claimed to teach nothing and to charge nothing for his discussions (compare with the Apology), and that Prodicus’ thought might have been far more important than Plato considered it to be.

e. Anonymous Texts

Two anonymous texts called the Anonymous Iamblichi and the Dissoi Logoi represent different ends of the spectrum of sophistic thought. The Anonymous Iamblichi is primarily an ethical work, dealing with reputation, virtue, and law. It exhorts the audience toward an education in virtue from an early age, because “a long time’s familiarity with a thing at length strengthens the practice, while a short time is not able to accomplish this” (Graham 865). Such a life requires self-control, especially an indifference to money, “by which everyone is corrupted” (Graham 867). The love for money is, for most people, merely a symptom of their fear, that is, fear of death, disease, old age, and so forth. These things can presumably be held at bay, so the masses think, by money. Rivalries and competition with others are also motives for greed. Thus, law is needed to ensure that money remains a good for the entire community, and moreover so that the community does not fall into dissolution. Lawlessness and greed beget tyranny. Thus, virtue and law are intimately connected.

The Dissoi Logoi, or Twofold Arguments, is a sophistic exercise in arguing for the relativity of things like good and bad, right and wrong, the just and the unjust, truth and falsity, and so forth. What is good in one situation might be bad in another, or good for one person, but bad for another. For example, “sickness is bad for the sick, but good for the physicians. Further, death is bad for those who die, but good for undertakers and makers of tombs” (Graham 879). The relativity of right and wrong to cultural sensibilities is also emphasized. “For example, it is right among the Spartans for girls to exercise naked and appear in public in clothing without sleeves and blouses; but it is wrong to the Ionians” (Graham 883). Again, “Among the Thracians it is a mark of beauty for girls to have tattoos; for everyone else tattoos are a punishment for a crime” (Graham 883). The problem with cultural relativism is that, when taken to its extreme, we cannot claim that certain activities are universally wrong or right, but only wrong or right relative to each culture. Thus, we may see that the arguments in the text are generally bad, but we have no reason to believe that they were meant to be good. The Dissoi Logoi might be emblematic of sophistical exercises at the time, but not necessarily of the more sophisticated of the Sophists.

11. Conclusion

From Thales to the Sophists, we see much variation in thought, as well as in the style and presentation of those different ways of thinking. Yet, we also see common threads running throughout Presocratic thought. On one hand, there is a tendency to think of the cosmos on its own terms. This new way of thinking often takes its course away from the confines of traditional, theocentric thought. Yet, on the other hand, many of these thinkers reformulated and reconceived God, the gods, and divinity. There is also a push towards ethics and thinking about human affairs and the best sorts of ways for human beings to live. Behind it all—the backdrop, as it were—is a preference for free, rational thought.

12. References and Further Reading

The lists of primary and secondary sources are very abbreviated. The secondary sources are generally accessible for non-specialists, and a good starting place for further research into the Presocratics. Some of these books also have extensive lists of references for further reading.

a. Primary Sources

  • Diels, Hermann and Walther Kranz. Die Fragmente der Vorsokratiker: Griechisch und Deutsch. Berlin: Weidmannsche Buchhandlung, 1910. Print.
    • This is the first and most traditionally used collection of Presocratic fragments and testimonies. This edition has the fragments in Greek with German translations. The book is no longer in print, and while it is often still cited in most scholarship, it is not the work cited in this article.
  • Graham, Daniel W. The Texts of Early Greek Philosophy: The Complete Fragments and Selected Testimonies of the Major Presocratics. 2 vols. Cambridge: Cambridge University Press, 2010.
    • This is the first collection of the Presocratic fragments and testimonies published with the original Greek and its corresponding English translations. It is the work cited in this text. Graham offers a short commentary on the fragments, as well as references for further reading for each thinker. He has organized by topic the fragments for each thinker, and labels the fragments with an F, followed by the number of the fragment. That is how the fragments have been cited in this article. Testimonies, as well as Graham’s commentary, are cited by page numbers.

b. Secondary Sources

  • Barnes, Jonathan. The Presocratic Philosophers. London and New York: Routledge, 1982.
    • A classic work with interpretations of the Presocratics.
  • Burnet, John. Early Greek Philosophy. London: A. & C. Black Ltd., 1930.
    • Another classic work with interpretations of the Presocratics.
  • Long, A.A. ed. The Cambridge Companion to Early Greek Philosophy. Cambridge: Cambridge University Press, 1999.
    • A collection of sixteen essays by some of the foremost scholars on Presocratic thought. The essays are generally accessible, but some are more appropriate for specialists in the field.
  • McKirahan, Richard D. Philosophy Before Socrates: An Introduction with Texts and Commentaries. Indianapolis: Hackett, 1994.
    • This is a very good book for non-specialists and specialists alike interested in further commentary on the Presocratics. The book contains most fragments for most thinkers and reasonable explanations and interpretations of each. There is also a helpful chapter at the end of the book on the nomos-phusis debate. The text includes a fairly extensive section for suggestions for further reading.
  • Vlastos, Gregory. “Ethics and Physics in Democritus.” Philosophical Review, vol. 2, 578-592, 1994.
    • This article is technical, but offers insight into the connection between Democritean physics and ethics, and was cited in the current overview.

 

Author Information

Jacob Graham
Email: jgraham@bridgewater.edu
Bridgewater College
U. S. A.

Armed Humanitarian Intervention

Humanitarian intervention is a use of military force to address extraordinary suffering of people, such as genocide or similar large-scale violations of basic human rights, where people’s suffering results from their own government’s actions or failures to act.  These interventions are also called “armed interventions,” “armed humanitarian interventions,” or “humanitarian wars.”  They are interventions to protect, defend, or rescue other people from gross abuse attributable to their own government.  The armed intervention is conducted without the consent of the offending nation.  Those intervening militarily are one or more states, or international organizations.

The need to consider and understand the many issues involved in humanitarian interventions has been brought home by the fact that these interventions have become more complex and more common since the 1980s, and by the consequences of non-intervention, such as in the Rwandan genocide of 1994, in which nearly one million people were killed in less than three months.  Humanitarian interventions raise many complex, inter-related issues of international law, international relations, political philosophy, and ethics.

This article considers moral issues of whether or when humanitarian intervention is justified, using just war theory as a framework. Section One addresses general characterizations of humanitarian interventions and commonly discussed cases, as well as some definitional or terminological issues. Section Two examines the question: What humanitarian emergencies rise to a level at which intervention is appropriate? Section Three presents just war theory as a common framework for justifying humanitarian interventions.    Section Four considers some other, related issues that may support or challenge armed interventions: international law, state sovereignty, the selectivity problem, political realism, post-colonialist and feminist critiques, and pacifism.

Table of Contents

  1. What is a Humanitarian Intervention?
  2. The Threshold Condition for Intervention
  3. Justifying Intervention: Just War Theory
    1. Justifying the Recourse to War (jus ad bellum) and Interventions
      1. Just Cause
      2. Right Intention and Right Authority
      3. Likelihood of Success, Last Resort, and Proportionality
    2. Justifying Conduct in War (jus in bello) and Justice after War (jus post bellum)
    3. Some Implications of Justifying Humanitarian Intervention
  4. Other Issues and Challenges
    1. International Law and Ethics
    2. State Sovereignty and Intervention
    3. The Problem of Selectivity
    4. Political Realism
    5. Post-Colonialism and Feminism
    6. Pacifism
  5. References and Further Reading

1. What is a Humanitarian Intervention?

The term ‘humanitarian intervention’ came into common use during the 1990s to describe the use of military force by states or international organizations in response to genocides, “ethnic cleansing,” and other horrors suffered by peoples at the hands of their own governments.  But cases of armed interventions are not new.  Several times during the nineteenth century European powers intervened militarily in various provinces of the Ottoman Empire to protect Christian enclaves from massacre or oppression (Bass). Following World War II there were many military interventions sometimes dubiously described as ‘humanitarian’, including by the United States in Latin America and France’s 1979 use of military force in its former colony, the Central African Republic.  Other cases remain notable foci of scholarly discussion:  India’s 1971 military intervention in East Pakistan, now Bangladesh; Vietnam’s 1979 intervention into Cambodia; and in the same year, Tanzania’s intervention into Uganda.  Later cases include uses of military force to protect Iraqi Kurds, and interventions in Somalia, Haiti, Liberia, and Sierra Leone, among many others.  The 1994 genocide in Rwanda focused attention on the consequences of failing to intervene, because external military force was not deployed to prevent the killing of nearly 1 million people in just three months of violence.

Philosophic attention to humanitarian interventions is not new.  The seventeenth century jurist, Hugo Grotius, is credited with originating the modern conception of armed humanitarian intervention.  In his classic work of 1646, The Law of War and Peace, he includes an entire chapter, “On Undertaking War on Behalf of Others,” and writes:

If a tyrant … practices atrocities towards his subjects, which no just man can approve, the right of human social connection is not cut off in such a case …. It would not follow that others may not take up arms for them.

Some argue that the earlier “just war” tradition’s appeals to natural law, in effect, permitted humanitarian interventions.  Classic theorists like St. Augustine, Thomas Aquinas, and Vitoria saw a just war as aimed at the just punishment of wrongdoing by other political leaders, which, some argue, would permit intervening against governments that mistreat their own people (Johnson).  In the nineteenth century John Stuart Mill appealed to the importance of communal self-determination in providing consequentialist arguments limiting armed interventions.  In the early 21st century, Michael Walzer entertained armed interventions as justified responses to acts “that shock the moral conscience of mankind” (Just and Unjust Wars, 107).

A humanitarian intervention is a form of foreign interventionism using military force.  Consider this paradigm characterization of humanitarian interventions as:

the threat or use of force across state borders by a state (or group of states) aimed at preventing or ending widespread and grave violations of the fundamental human rights of individuals other than its own citizens, without the permission of the state within whose territory force is applied. (Holzgrefe, 18)

Humanitarian interventions are distinguished from other forms of interfering with another state’s activities, such as humanitarian aid, sanctions of various kinds, altering of diplomatic relationships, monitoring of arms treaties, elections, or human rights practices, and peace-keeping.  A humanitarian intervention does not require the consent of the target state: it is a form of coercion.  The government is deemed culpable in the suffering of others that is to be prevented or ended.  Those suffering and targeted by the rescue effort are not nationals of the intervening states:  humanitarian interventions are, as Nicholas Wheeler puts it, about “saving strangers.”  Definitions are typically neutral as to whether the intervention is unilateral or multi-lateral and as to whether it is authorized (for example, by the United Nations) or unauthorized.  Finally, the interveners’ purpose is rescue, defense, or protection of those who are suffering due to their own government’s actions or failures. The purpose is not conquest, territorial control, support of insurrectionist or secessionist movements, regime change, or constitutional change of government.

Humanitarian interventions vary in terms of the motivations of the states using military force.  Some stricter definitions require a purity or primacy of intention in the use of armed force: on such definitions, militarily addressing the suffering of others for reasons of national interest is not humanitarian intervention.  Other definitions attend more to the effects of intervening than to motivations.  These definitional disputes involve evaluating actions on behalf of others.  The issue, then, may be more a matter of how much normative work is to be done by the definition rather than by a separable ethical judgment of the actions themselves.  A deontologist like Kant or Aquinas, for example, might maintain that genuine instances of a morally worthy act require a purely humanitarian intention, while a utilitarian like Mill might insist that the motive matters not at all to what the act is or to the act’s morality, but only to our judgment of the actor. Such definitional issues also intersect with doctrines of political realism as explanations of states’ behavior (In IEP, see “Political Realism” and “Interventionism,” sec. 3b).  If all state action is explained by national self-interest, typically understood in terms of national security, military or economic power, or material well-being, then all state actions are necessarily motivated by self-interest; actions motivated solely or primarily by humanitarian considerations are precluded; and there are, by these stricter definitions, no genuine examples of humanitarian interventions (see IV.d below).

Another terminological consideration is reflected in the work of the 2001 International Commission on Intervention and State Sovereignty (ICISS), The Responsibility to Protect.  The Commission prefers the phrase in its title to ‘humanitarian intervention’ because it avoids militarizing what is a humanitarian action and avoids the approval of military action connoted by labeling it ‘humanitarian’ (sec. 1.39-1.41).  Indeed, these semantic concerns are grounded ultimately in re-conceiving state sovereignty not as a right not to be transgressed by outsiders, but as a duty to protect the people of a state and, if needed, people of other states (sec. 1.35).  Many differences of definition about what constitutes a humanitarian intervention reflect varying views about the normative merits of and justifications for using force to address the suffering of others at the hands of their own government.

2. The Threshold Condition for Intervention

Even proponents of humanitarian intervention hold that such uses of military force are justifiable only in very limited circumstances.  In particular, proponents attempt to specify minimum, threshold conditions in terms of the severity, scale, and kinds of human suffering necessary (but not sufficient) to justify intervention.  For example, seeing interventions as rescues, Michael Walzer specifies situations of “massive violations of human rights” where “what is at stake is the bare survival or the minimal liberty” of a people (Just and Unjust Wars, 101).  The ICISS Report, The Responsibility to Protect, specifies a threshold condition in terms of “large scale loss of life, with genocidal intent or not,” or “large scale ‘ethnic cleansing’, … whether carried out by killing, forced expulsion, acts of terror or rape” (sec 4.19).  In a similar vein, Nicholas Wheeler speaks of supreme humanitarian emergencies, where there are “extraordinary acts of killing and brutality” beyond the “abuse of human rights that tragically occurs on a daily basis” and that are of a magnitude and severity such that “the only hope of saving lives depends on outsiders coming to the rescue” (34).  Common among specifications of threshold conditions are requirements that the most basic of human rights are being violated, that the human suffering is widespread and systematic, and that the government bears some culpability for what is happening to its people.  Interventions, then, are justifiable only to address the most egregious violations; the threshold conditions in the target state must be those that, as Walzer put it, “shock the conscience of mankind.”

Specifications of threshold conditions raise several issues.  A specification of the conditions of suffering will be inherently vague. How many rights violations or how many horrors or how extraordinary must the violations be in order to satisfy the threshold condition for armed intervention?  A second issue involves the invoked notion of basic human rights.  It is commonly held in Western ethical literature that all human rights are equally important (See “Human Rights” in IEP).  Attention to violations of basic human rights, however, presupposes a hierarchy of such rights for all humans. Allen Buchanan, for example, includes significant civil and political rights as well as “the right to resources for subsistence” in a list of basic human rights (Justice, 129); in The Law of Peoples, John Rawls identifies “a special class of urgent rights” that includes ethnic groups’ right to “security from mass murder and genocide” (79).   Some have argued that negative rights (for example, not to be tortured, not to be raped, not to be killed) are more basic and more important than positive rights (for example, to basic welfare requirements of food, clothing, shelter), though, following Henry Shue’s analysis in Basic Rights (Princeton, 1980), not all accept such a distinction or hierarchy.  There are disagreements about the extent to which basic human rights are the rights of individuals or may include rights of collectives, or “group rights,” such as rights to collective self-determination, group survival, and cultural integrity.  International human rights law introduces yet other hierarchies which may be relevant. By international treaty the right not to be tortured is uniquely absolute; only a few human rights are not to be derogated even if the nation’s survival is at stake, while during declared public emergencies a state can set aside other human rights (International Covenant on Civil and Political Rights, Article 4.1-4.2); and the legal obligations of states to respect civil and political rights are much stronger than for social, economic, or cultural rights (cf. International Covenant on Civil and Political Rights, Article 2, and International Covenant on Social, Economic, and Cultural Rights, Article 2).

Threshold conditions also include a requirement of government culpability, which can take various forms.  Human suffering and rights violations, even when widespread and systematic, may be perpetrated by the government itself, as, for example, during the Holocaust in Nazi Germany; the late 1970s “killing fields” of Cambodia’s Khmer Rouge regime; or the 1990s “ethnic cleansings” conducted by the Bosnian Serbs in the former Yugoslavia.  Or the government may be complicit, indirectly fostering human rights violations by providing funding, arms, or logistical support to private militias, by coordinating attacks on people through control of the communication infrastructure, or by inciting action through propaganda and other forms of media control. This was the case in the Rwandan genocide of 1994 and during the violent campaigns by the janjaweed in Darfur, Sudan, beginning about 2004.  Or state involvement may be more akin to negligence, incompetence, or inability to govern. In inept or failed states, the government does not maintain effective control of territory and people, which often leads to widespread violations of human rights by non-state actors. Somalia was a “failed state” by the early 1990s: in much of the country, people lived in fear of armed militias while the central government could not assert effective control, and the United States eventually intervened militarily.  The government culpability necessary to satisfy threshold conditions can thus range from perpetration to state failure.  Furthermore, situations often mask or complicate whether threshold conditions are satisfied.  Widespread and systemic human suffering often occurs amidst or accompanying domestic insurrections, counter-insurgency campaigns, revolutions, liberation efforts, partition or secession battles, or civil wars, for example.  In Darfur, the government of the Sudan claims to have been conducting a counter-insurgency campaign; the Bosnian War can be seen as part of a secession or partition battle; the Rwandan genocide occurred in the context of a civil war and struggle for power in the country.  The challenges here are both epistemic and conceptual: whether threshold conditions of suffering are satisfied may depend on the specific domestic contexts in which people and government find themselves.

Though most discussions of humanitarian interventions specify threshold conditions in terms of human rights violations, other characterizations of the relevant human suffering are also used.  These differences have justificatory implications.  A characterization in terms of human rights readily suggests deontological justifications for armed interventions.   For any genuine right, others are bound by correlative duties.  Even if the primary correlative duty for human rights falls on a national government, some argue that the correlative duties of others include obligations to respect, protect, and enforce human rights whenever the primary duty-holder fails to do so (these are sometimes called “default duties”).  Thus, armed interventions are justified partially as discharging (default) duties correlated with the human rights that are being violated in the target state.  On the other hand, some see armed interventions as aimed at reducing human suffering, regardless of whether there are violations of specific human rights.  As mentioned above, the ICISS Report specifies “large scale loss of life” or “large scale ‘ethnic cleansing’.”  Such a characterization of threshold conditions suggests a direct consequentialism at work.  Some feminists have argued that social oppression of women constitutes threshold conditions for forceful interventions (Cudd).  Uses of armed force, of course, have costs in human suffering, too.  The idea is that sometimes the use of deadly force is justifiable to save lives and reduce total human suffering.

Another development relevant to interventions is the concept of human security.  On this conception, security, understood as multi-faceted, applies not only to states (as in “national security”), but also to people.  The concept of human security is defined broadly both in terms of the causes and the kinds of human suffering.  For example, the ICISS Report, The Responsibility to Protect, describes the security of people as

their physical safety, their economic and social well-being, respect for their dignity and worth as human beings, and the protection of their human rights and fundamental freedoms. (sec. 2.21)

Furthermore, valuing human security requires addressing “threats to life, health, livelihood, personal safety and human dignity” without regard to the sources of these threats, whether governmental, man-made, or natural.  The United Nations Development Program has adopted a similarly broad definition.  With respect to threshold conditions for humanitarian interventions, using the broad concept of human security has some advantages.  It eliminates the need to establish target state culpability for its peoples’ suffering; and, in fact, often some government action or inaction at least partially explains even famines and the effects of droughts or earthquakes.  Determining whether threshold conditions are satisfied is also simpler without the need to apply specific legal or moral categories such as basic human rights or genocide.  Furthermore, it is argued, the concept calls attention to preventing humanitarian emergencies from emerging, instead of focusing so much on armed interventions as reactions to emergencies.  But the breadth and scope of the concept also make it challenging to use as a threshold condition for humanitarian interventions.  Virtually any kind of widespread, systematic suffering or threat to people becomes a security issue possibly addressed by an armed intervention: many situations around the world thereby satisfy a requisite condition for justifiable intervention.  The concept’s breadth erases or postpones justifying priorities, both by states trying to address their own peoples’ needs and by states or organizations readying to rescue those not secure under their own governments.  As a threshold condition, the breadth of the concept of human security only makes more common the problem of properly selecting which of many humanitarian emergencies warrant others’ use of armed force to alleviate human suffering (see IV.c below).  For these and other reasons, the concept of human security is not often invoked in articulating threshold conditions for interventions.

3. Justifying Intervention: Just War Theory

The satisfaction of specified threshold conditions and state culpability requirements are only necessary conditions for morally justifying humanitarian interventions.  There is a paradoxical quality in using deadly force to prevent or end violence against others.  How can it be that war is warranted in the name of saving lives? A common response employs the "domestic analogy," seeing states as analogous to persons.  As a matter of morality and legality, individuals have rights of defense that permit using deadly force as a proportionate response to unavoidable, imminent threats to our own lives or to the lives of others, whether the endangered people are kin, akin, or strangers.  By analogy, then, states have not only rights of self-defense if attacked, but rights to use deadly force in defense of others.  A second analogy also sees states as persons.  Just as individuals, in some circumstances, ought to perform beneficent acts such as "interposing to protect the defenseless against ill usage" and "saving a fellow-creature's life," to use John Stuart Mill's phrasing, so states sometimes are right to rescue others being poorly treated under their own government.  More direct arguments see a connection between taking universal human rights seriously and acting rightly with deadly force when this force is necessary to defend or protect those rights.  Direct consequentialist arguments appeal to the morality of preventing extraordinary suffering when possible, that is, if and when there is opportunity and capability to act in ways that are not more costly in their effects on human lives than not acting with deadly force.  Thus, there need not be inconsistency or paradox in saving lives by using armed force, at least in some grave circumstances.

Discussions of whether humanitarian interventions are justified take seriously both the moral pull of extreme humanitarian emergencies that “shock the conscience of mankind” and a moral reticence about using deadly force even to save lives. Regardless of the kind of moral theory employed – direct or indirect utilitarianism, natural law principles, or correlative duties of human rights, for example – justifying an armed intervention involves addressing a host of questions:  Who or what has the authority to intervene? Is an intervention likely to succeed, or be worth the costs, on balance?  Are there not non-military measures available to address the human suffering?  What exactly is the purpose of the military action and how are armed forces to conduct themselves in defending, protecting, or rescuing others from their own government?  Such questions are, in fact, paralleled in the structure of just war theory, or jus bellum, and its traditional duality:  jus ad bellum, or the conditions requisite for justifiably going to war, and jus in bello, the principles governing proper conduct of war.  Just war theory—especially jus ad bellum—is the framework for making moral decisions about humanitarian interventions.  For example, in Saving Strangers, Nicholas Wheeler says “requirements that an intervention must meet… are derived from the Just War tradition” (33-34).  The 2001 ICISS Report, The Responsibility to Protect, summarizes “criteria for military intervention… under the following six headings:  right authority, just cause, right intention, last resort, proportional means and reasonable prospects” (sec. 4.15-4.16ff.).  Michael Walzer defends interventions in his classic work, Just and Unjust Wars, and again prominently in the Preface to the Third Edition of that book.  Many critics challenge the suitability, adaptations, and implications of just war theory for humanitarian interventions. So, proponents, opponents, and cautionary discussants employ just war theory in exploring the moral merits of humanitarian interventions.

There are additional reasons for relying on the jus bellum framework.  Humanitarian interventions resemble wars and are even sometimes referred to as "humanitarian wars."  Military force is used in another nation's territory in order to rescue, protect, or defend people.  The most basic moral question of modern just war theory is delineating what states are permitted to do through the use of military force to those outside their borders, and for achieving what aims or purposes.  Second, the classic just war tradition includes attention to what are now called humanitarian interventions, at least as far as the cause and purpose of such military action are concerned.  Morally justifying humanitarian interventions, then, is often explored by interpreting, applying, or adapting the standards for judging whether going to war is justified; receiving the most attention are issues of just cause and right authority for interventions.  Other major facets of just war theory and its tradition – jus in bello and jus post bellum – are also employed, though less prominently, as there has been much less philosophic attention to the conduct of interventions or to what follows the use of armed force to rescue, protect, or defend others.

a. Justifying the Recourse to War (jus ad bellum) and Interventions

The jus ad bellum framework of just war theory identifies about a half dozen considerations relevant to justifying the recourse to war.  All the ad bellum requirements must be satisfied for war to be justified.  So, the use of armed force for humanitarian purposes is justified only if all six ad bellum requirements are satisfied.  Three of these considerations – last resort, likelihood of success, and proportionality – are consequentialist requirements.  Proportionality, for example, requires that the benefits of military action are not overshadowed by the inevitable costs, destruction, and other negative effects.  Likelihood of success involves estimating the consequences of waging war, specifically, the probability that the war's aims will be accomplished.  Last resort captures the idea that war is worth its effects only if non-military means are not available for success: recourse to war is justifiable only if alternative, pacific courses of action will not achieve the morally acceptable aims of war (tied to a "just cause").  The other three jus ad bellum considerations – just cause, right authority, right intention – appear to be deontological, rooted, for example, in natural law, human rights, or other normative, non-consequentialist principles.  Pivotal among jus ad bellum considerations is the notion of "just cause" for war.  Adapting the jus bellum framework to humanitarian interventions brings a mixture of deontological and consequentialist reasoning to the issues, with the satisfaction of threshold conditions – a "just cause" – being central to justifying the use of armed force to address human suffering.

i. Just Cause

In the just war tradition, just cause has long been among the basic considerations in determining whether the recourse to military force is justified.  St. Thomas Aquinas’s famous first articulation prominently includes “just cause” as a requirement, as do virtually all subsequent contributors to the tradition.  The idea is that certain circumstances rightly prompt and contribute significantly to a justification for a war.  Furthermore, the just war tradition, just war theory, and international law today acknowledge that armed attack by another justifies going to war: wars of reactive self-defense clearly satisfy the “just cause” requirement.  As applied to humanitarian interventions, then, the issue is whether a “just cause” includes defense of others, or as many state it, whether threshold conditions for intervention are a “just cause” for a state or states using armed force to rescue, protect, or defend other people.

Supporters of justifiable interventions call attention to features of the just war tradition.  As noted in sec. I, Hugo Grotius explicitly acknowledges that a government's subjects suffering atrocities permits others to "take up arms for them."  A dominant theme of the classic just war tradition is that punishable wrongs are a just cause for war, even if the intervening party has not been wronged.  James Turner Johnson, for example, suggests that traditional just war theory is not based on a presumption against war, but on a presumption against injustice:  a just war is not only a justified war, it is a war waged for justice.  The interpretative contention, then, is that only in the early 21st century has just war thinking come to be so restrictive about "just cause" as to allow only for wars of self-defense.

Aside from interpretations of the just war tradition, a number of fundamental issues are at stake in debates about the substantive content of the "just cause" requirement as it pertains to humanitarian interventions.  One matter deals with the kind of moral foundations presupposed for just war theory itself. Some appeal to transnational ethical norms about rights or duties, whether expressed as universal natural law principles about rights of defense or as duties correlated with universal human rights.  So, sometimes also coupled with the "domestic analogy" between persons and states described above, the ethical arguments turn on whether there is a natural law duty to rescue or render aid, a (default) duty to enforce human rights, or a transnational right to defend others conjoined to the uncontroversial self-defense right of states.  Discussions often challenge the adequacy of the analogies: states seem much unlike individuals when it comes to ethical norms.  Also, legal positivists, especially, find the appeal to natural law more than suspect, and positive international law is explicit only about states' right to self-defense (see section IV.A. below).  Others discuss the "just cause" requirement by invoking different conceptions of the world community.  At one extreme, the world community is inter-national, a community of nations or sovereign states relating to one another by mutual agreement; an opposing conception thinks of a global ethical order of trans-national norms about people (that is, human rights).  In effect, some of the debate is couched in broad issues of how state-centered or people-centered the world ethical order is to be.

ii. Right Intention and Right Authority

Intention, or purpose, and authority have both been basic considerations in determining whether the recourse to military force is justified.  Aquinas’s first articulation of just war theory includes right intention and “right authority” as requirements.  Matters of intention, or purpose, have since not always been accorded independent status.  For example, Grotius does not list intention as a separate requirement for justly going to war, and later versions of just war theory often seem to conflate “right intention” and “just cause.” In the application of just war theory to humanitarian interventions, however, the “right intention” requirement figures prominently in many discussions.  As noted in sec. I, the issue emerges as a matter of definition, and some maintain purity of motive is essential to being a humanitarian intervention.  Others note that the classic just war tradition mostly excludes certain aims or purposes for going to war – “not out of greed or cruelty, but for the sake of peace, to restrain the evildoers and assist the good,” writes Aquinas.  So, the classic “right intention” requirement, it is argued, allows for a plurality of motives for waging war so long as excluded are such aims as conquest, territory, control of natural resources, and vengeance.  When applied to humanitarian interventions, then, use of military force satisfies the “right intention” requirement if, among a plurality of motives, a primary purpose is addressing the widespread and systematic human suffering (Wheeler, 37-40).

The classic just war tradition emphasizes issues about the locus of authority to deploy military force.  Modern just war theory typically presumes that states are the proper right authority.   The advent of non-state war-makers – terrorist organizations, liberation movements, insurgencies, and insurrections, for example – raises interesting questions for this jus ad bellum issue.  In applying the just war theory framework to humanitarian interventions, however, the “right authority” requirement is prominently discussed, often in the context of international law and institutions.  Under the United Nations Charter, except for wars of reactive self-defense, a state is explicitly permitted to employ military force against another only with Security Council authorization to “preserve international peace and security” (see IV.A. below).  For this and other reasons much discussion is devoted to whether interventions are justifiable when unauthorized by the United Nations (and thus, illegal).

There also are some ethical dimensions to “right authority” questions about humanitarian interventions.  In as much as impartiality is an ethical norm, there may be a strong presumption for only centrally authorized or multi-lateral interventions being justifiable.  In as much as speed of response to a supreme humanitarian emergency saves more lives and a single state can be more decisive, there is support for permitting unilateral and unauthorized interventions.  In as much as there are moral objections to the current, restrictive international law of force, states’ or international organizations’ unauthorized interventions may have some moral merit as a way of reforming the law or as a justified cost of promoting basic justice or protecting basic human rights.  In as much as the quality of the intervention is affected by characteristics of the intervening party – suitable military capability, quality command and control infrastructures, experience, a good human rights record – perhaps only certain states or organizations satisfy the “right authority” requirement for justifiable humanitarian interventions.

iii. Likelihood of Success, Last Resort, and Proportionality

Three additional jus ad bellum requirements also must be satisfied for a war to be justified.  As applied to justifying an armed intervention, then, using military force to address a humanitarian emergency must be likely to succeed.  As in just war theory, the principle of likely success presupposes a sufficiently clear understanding of "just cause" and "right intention."  For humanitarian interventions, then, success is at least preventing or stopping the widespread violence and suffering that constitutes "just cause" and defines the purpose of the incursion.  If such success is not likely, then the intervention is not justified.  Aside from the inherent vagueness of the standard, estimating the likelihood of a successful intervention is complicated, a function of at least two general factors (among others): the military capabilities and effectiveness of the intervener, and the capabilities of the target state or other forces involved in the violence that constitutes "just cause."  The latter can be further complicated by the need to estimate secondary effects: for example, whether an armed intervention may provoke target state allies' military mobilization and responses, with looming possibilities of a larger conflict.  As even proponents of intervention admit, some interventions will not succeed and "some human beings simply cannot be rescued except at an unacceptable cost" (International Commission, sec. 4.41).  It also follows that inequality of military power among states is normatively significant.  A humanitarian intervention is not likely to succeed against large, powerful states, like China or Russia, while success is more likely for emergencies occurring in smaller, weaker nations; furthermore, large, militarily powerful states are more likely to be successful interveners than smaller, weaker nations or organizations.  Thus, inequalities of states' military power create inequalities of immunity and vulnerability to justifiable armed interventions; and power differentials create inequalities of moral right, responsibility, or duty to intervene in response to human suffering around the world.

The last resort requirement expresses the general idea that war is worth its deadly, destructive effects only if no non-military alternative will work to achieve the same ends (that is, what counts as success, which is linked to "just cause" and "right intention").  Though the general idea is more than plausible, specifying the "last resort" requirement precisely is controversial because one can almost always argue that not all non-military avenues have failed: more diplomacy or negotiation almost always seems possible.  Indeed, on a literal construal, no war ever satisfies this jus ad bellum requirement.  On the other hand, in law and morality, reactive wars of self-defense are justified, even though non-military means of resolution, in fact, are not attempted after an armed attack occurs.  So, justifying a war as a last resort depends on at least two features of the specific situation: time, or urgency of action, and the likelihood that non-military measures would succeed.

Supreme humanitarian emergencies often exhibit urgency akin to that of a state facing a surprise attack or invasion: if lives are to be saved and people to be rescued, there is not time for peaceful pressure, coercion, diplomacy, negotiation, sanctions, or boycotts to work effectively.  The Rwandan genocide of 1994 vividly illustrated this sort of urgency.  Aside from temporal considerations, intervening as a last resort involves assessing the likelihood of any non-military means being effective, but not necessarily actually implementing or trying all those means.  This leads to typically counterfactual formulations of the “last resort” requirement, as illustrated by the ICISS Report, The Responsibility to Protect.

[Last Resort] … does not necessarily mean that every [non-military] option must literally have been tried and failed…. But it does mean that there must be reasonable grounds for believing that … if the measure had been attempted it would not have succeeded. (sec. 4.37)

This way of proceeding points to a second temporal feature of war as last resort.  Some opponents of intervention bemoan the lack of infrastructure, for example, that would enrich and support effective, non-military means of defending and protecting basic human rights.  The idea is that more could have been done to prevent horrors and to be ready to react non-militarily when emergencies do emerge.  An issue here, then, is the time framework for constructing possible means of addressing the emergency.  A circumscribed last resort principle requires assessing the effectiveness of those means available at the time of the emergency, however rich or limited they may be.  A broader last resort principle seems to deny armed force is a last resort if more could have been done in the past to enrich the availability of effective non-military means today.  This broader construal, however, seems to conflate a call to build better prevention mechanisms with assessing military and non-military options available when supreme humanitarian emergencies actually occur and decisions have to be made.

The jus ad bellum proportionality requirement is often labeled "(macro-)proportionality," to distinguish it from the in bello, or (micro-)proportionality, principle.  The ad bellum principle addresses the general concern that the deaths, destruction, and other negative effects of war must be balanced by its benefits (that is, success).  In considering war's effects the proportionality principle precludes excessive partiality.  So, a war's effects on everyone are to be counted – civilians and combatants, whether friend, foe, or neutral.  All death, injury, and destruction are to be considered, and relevant effects must not be limited to one's own national interest but must include the international community.  This breadth of considerations brings to the fore difficult matters of the commensurability of values and, as for any consequentialist argument, epistemic challenges related to the causal impacts of action.  Yet some rough estimates of wars' costs and benefits can be and have been plausibly made.  For example, a few thousand armed soldiers quickly deployed to Rwanda in April, 1994, would likely have saved many, many lives, whereas militarily stopping the suffering of Chechnyans or Tibetans would very likely bring exorbitant costs in death and destruction.  In other cases, while the benefits of an armed intervention include rescues of suffering peoples, a cost might be a significant eroding of the stability and order of the system of states on the planet.  The idea is that vigilante justice by state militaries has costs to the international system's order, stability, and peace, costs that are not balanced adequately by the reduced suffering of people in a particular nation or region.  Michael Walzer, like other just war theorists, concludes that the proportionality requirement "… is a gross truth, and while it will do some work in [some] cases …, it isn't going to make for useful discriminations in the greater number of cases" (Arguing 90).  Contributing to the challenges for the proportionality requirement are controversies about its structure: for example, whether an adequate macro-proportionality requires minimizing the bad effects, maximizing the net benefits, or merely requiring that the benefits not be outweighed by the bad effects.  Applied to justifying armed interventions, then, the macro-proportionality requirement speaks to a central concern, but cannot reliably discriminate finely among the many humanitarian emergencies that arise.

b. Justifying Conduct in War (jus in bello) and Justice after War (jus post bellum)

The general idea of proportionality is one that links the traditional division of just war theory into jus ad bellum and jus in bello principles.  The latter just war requirements govern how a war is to be conducted:  proportional means are to be used, and non-combatants are to be distinguished from combatants in waging war.  Given the just war theory framework for justifying humanitarian interventions, these in bello considerations are relevant and applicable to uses of military force to address humanitarian emergencies.  If an armed intervention is to be a (fully) just war, then, the rules of engagement (ROEs) need to reflect both in bello principles. These principles raise many issues for just war theory and some challenging ones for the morality of interventions.

The in bello micro-proportionality requirement governs military operations during a war.  The general idea is to minimize the armed force used, and destruction caused, in order to attain a militarily necessary objective.  But unlike with the ad bellum macro-proportionality requirement, in assessing the effects of a military operation it matters much who benefits or suffers.  First, combatants and non-combatants are to be distinguished.  This is the in bello principle of discrimination.  As Michael Walzer expresses it, the general idea is that wars are waged between combatants: non-combatants "are not currently engaged in the business of war" and thereby are "outside the permissible range of warfare" and carry an "immunity from attack" very much unlike combatants.  Though being more specific about the distinguishing criteria or about the permissibility of some non-combatant casualties (that is, as "collateral damage") is controversial and complicated (See much of Walzer's Just and Unjust Wars, for example), in estimating the consequences of a military operation in war, one is to give much greater weight to any ill effects on non-combatants: who suffers matters much.  Second, in war and interventions it is permissible to give greater weight to the costs to one's own forces than to losses to opposing combatants.  The notion of "force protection" becomes morally acceptable, at least within some limits: the conduct of the war is justifiable even if operations distribute risks more to opponents' forces than to one's own forces, provided there are "no more casualties than necessary inflicted on the other side." But third, those force protection limits for intervening forces can become problematic. As illustrated by the Kosovo intervention of the 1990s, high altitude bombing effectively reduces the interveners' losses while also increasing the costs to non-combatants on the ground.  How much "collateral damage" to non-combatants (and non-military property) is morally permissible in order to reduce risks to one's own forces?  How much death and damage to opponents' military force is not excessive and is morally permissible in order to achieve humanitarian ends?

The traditional in bello requirements of just war theory lead to challenges for this approach to the morality of humanitarian interventions.  George Lucas, for example, argues that "the use of military force in humanitarian cases is far closer to the use of force in domestic law enforcement" than it is to waging war.  Interveners are there to protect and defend, akin to the mission of a police force.  Seeing intervening forces in this more constabulary role entails that "international military 'police-like' forces (like actual police forces) must incur considerable additional risk, even from suspected guilty parties," while, like domestic police forces, they must "refrain from excessive collateral damage, … the deliberate targeting of non-combatants, … [and] engaging in violation of the law."  These are "far more stringent restrictions in certain respects than traditional jus in bello" requirements.  Indeed, these stringent restrictions apply even if interventions are seen as "saving strangers" and the mission seen as a rescue.  Thus, Lucas concludes, "the attempt to assimilate or subsume humanitarian uses of military force under traditional just war criteria fails."  Interventions "are sufficiently unique as to demand their own form of justification, … jus in pace, or jus in intervention," and Lucas proposes specific, substantive requirements for interventions in a structure parallel to the traditional just war framework of jus ad bellum and jus in bello principles.

A third major facet of just war theory, jus post bellum – literally "after war" – is also sometimes a framework for examining the morality of humanitarian interventions. The 18th century work of Immanuel Kant, in Perpetual Peace and elsewhere, is often credited with originating the notion of jus post bellum, though Vitoria and Suarez had both earlier distinguished this facet of just war theory.  The roots of the notion are embedded in the classic "right intention" requirement that the aim of justifiable war is to be peace.  How one ends a war – even a just one – affects whether peace will follow, for how long, and the structure of the peace that will or should be.  For example, is a just end of war the establishment of the status quo ante bellum, which is perhaps plausible for wars of self-defense against an invasion? Or ought a war end by establishing "peace with justice?" And what might such justice require – unconditional surrender, reparations, repatriations, disarmament, punishment of perpetrators, structural adjustments in the distributions of land or wealth, establishment of democracy, restorations of relationships?  In international law and among just war theorists, this third major component of just war theory has received comparatively little attention. (One important exception is the work of Brian Orend.) Jus post bellum issues are important to the morality of humanitarian interventions. As C.A.J. Coady cautions, considering interventions requires specifying not only from what it is people are to be rescued, but also for what it is that they are rescued (in Chatterjee & Scheid, 291).

Jus post bellum considerations lead to tensions and challenges for thinking about the purposes and morality of humanitarian interventions.  For example, given the nature of supreme humanitarian emergencies, stopping the violence leaves a great need for extremely difficult reconciliation processes, a facet of rebuilding a functioning social order.  In addition, to prevent recurring violence the root causes may need to be identified and addressed, which likely involves major changes in a society’s basic structure and institutions.  Perhaps justice requires some punitive action towards perpetrators and accomplices, whether heads of state, government officials, or local militia leaders and private citizens.  Arrests, war crimes trials, truth commissions, and the like may be warranted for what is sometimes called “transitional justice.”  A concern is that seeking retributive justice can counter needs for reconciliation and matters of restorative justice, as can redistributions of wealth, land or political power to address root causes of the violence (Govier, Orend).  The ICISS Report, The Responsibility to Protect, identifies such issues as elements of “the responsibility to rebuild.”  A long-term aim of genuine peace, then, generates complex questions of how such peace relates to other important aims, such as justice.

These kinds of post bellum considerations effectively broaden and perhaps challenge just war thinking about the morality of humanitarian interventions.  For example, if the ad bellum success requirement is more than rescuing, defending, or protecting victims, but also includes justice (retributive, distributive, restorative) or rebuilding, then the challenges of success are much greater, the likelihood of success is much less, and the capabilities for success (including political will) are rarely available.  It would follow that virtually no interventions are justifiable by just war standards:  many more people will be left beyond rescue than under narrower understandings of what a successful intervention entails.  Second, these broadening, post bellum considerations challenge the very conception of interventions as rescues, as "saving strangers."  They make interventions perhaps more akin to a police action, with attention to arrest and enforcement, or more like a mission to establish peace with justice, or more like a complex, long-term humanitarian aid program of which one significant dimension is the use of armed force.  On the other hand, one can look at the responsibility to rebuild or seek justice post bellum as a distinct phase following the humanitarian intervention proper.  A fully just use of military force – even seen as rescue, protection, or defense – may require that some organization or states address post bellum issues and rebuilding once the violence is ended, but those needs need not be addressed by the interveners themselves.  Just war thinking then requires that interveners use military force with consideration of post bellum requirements, but the post-intervention missions need not be the action of the rescuers themselves.  Some such distribution of responsibilities may be, for example, what is envisioned by the ICISS Report, The Responsibility to Protect.

c. Some Implications of Justifying Humanitarian Intervention

Just war theory, in its entirety, articulates appropriately high standards for morally judging war and for justifying humanitarian interventions.  Even the ad bellum standards more frequently addressed are not all easily satisfied (a point sometimes insufficiently appreciated due to the excessive focus on threshold conditions, or "just cause").  There are some significant implications of the just war framework for assessing the moral justifications for humanitarian interventions.  There are, for example, daunting epistemic issues in establishing that threshold conditions are satisfied or in assessing the complex consequences of an intervention.  The latter is only aggravated by the near certainty of unintended consequences for military campaigns and by the frequent need to estimate the effects of using armed force in foreign lands and cultures.  As already noted, given the inequalities of military capability among states that affect interveners' likelihood of success, just war thinking leaves some nations much more vulnerable to interventions for mistreating their own people, while other states can violate human rights with impunity from others' use of military force to stop the violence.  The same inequalities of capability result in an unequal distribution of the right or responsibility to intervene militarily on humanitarian grounds, with all the attendant costs of such interventions.

There is a basic deontic category issue in exploring the moral merits of humanitarian interventions via just war theory.  Is a justified war a matter of a right, responsibility, or duty?  And what kind of right or duty is signaled by establishing that a war is justified?  Parallel questions then apply to justified humanitarian interventions:  are they a matter of a right to use armed force, or a responsibility or duty of some kind?  What kind of right or duty, then?  Addressing such questions from a just war framework intersects with varying conceptions or analogies employed in discussing interventions.  For example, one might consider interventions in defense of others as a right associated with rights of self-defense.  Associated with individuals' right of self-defense is a right to use deadly force in defense of others.  In parallel fashion, then, associated with states' right to wage wars of self-defense would be a right to use military force in defense of others.  Such a right of defense – of self or of others – is one that the right-holder chooses whether to exercise or not: just as there is no duty to fight in self-defense, then, there is no duty to use deadly force in defense of others.  Humanitarian interventions, then, are a matter of moral right, not duty or obligation; and they are what are called liberty-rights or discretionary rights of intervention.  In as much as jus ad bellum principles identify when there is such a right to wage war, they can be used to identify when there is a moral right to intervene militarily for humanitarian purposes.

In contrast, armed interventions are often portrayed in ways suggesting there is a duty to use military force to address humanitarian emergencies.  A common conception is the notion of interventions as rescuing others.  The "rescue" metaphor suggests that using military force is an imperfect duty of individual beneficence or charity, of rendering aid to strangers facing life-threatening situations, or what are sometimes called "Good Samaritan" duties.  Such imperfect duties are not correlated with others having a right to be rescued, and wide discretion is accorded the obligated as to when, where, and how to discharge the duty.  As we have seen, though, the "Good Samaritan" analogy may be a strained one, at best:  states' or international organizations' interventions are not relevantly or sufficiently akin to "saving strangers" as if a tragic accident had befallen them.  It has also been suggested that interveners are more akin to a police force, which suggests that justified interventions are discharging a duty to protect and defend others in grave danger.  A humanitarian intervention, it would seem, is justified under conditions analogous to those for domestically dispatching S.W.A.T. teams.  A challenge for either of these conceptions of interventions – as rescue or as constabulary – is a dissonance between a moral duty to use military force for humanitarian purposes and the kind of moral justification for waging war the jus ad bellum principles provide.  Does just war theory establish a moral duty to wage war?  If not, how can jus ad bellum principles ever support a duty to intervene militarily? To speak of a moral duty to wage war is today not obviously plausible.  The notion of a duty to wage war may be consistent with some classic contributors to the just war tradition, such as Aquinas. Late 20th century theorists, like Michael Walzer, argue that sometimes there is a duty literally to combat evil.  But the idea that just war theory establishes duties to wage some wars is controversial and defended by some for only quite unusual circumstances, of which, of course, a supreme humanitarian emergency may be one.  As some have argued, jus bellum can establish at most the moral permissibility or right of intervening; additional considerations are needed to establish humanitarian interventions as morally obligatory.

Other proponents of humanitarian interventions argue for a duty to intervene based on taking human rights seriously, as a duty correlated with people's basic human rights not to be tortured or killed arbitrarily.  The general idea is that, as moral claim-rights held by all, human rights entail that everyone has a duty to protect and promote the human rights of everyone else (See IEP article, "Human Rights," sec. 3).  One form of the argument is that, as a matter of international law, practice, and practicality, these correlative obligations fall largely upon national governments and international organizations (for example, the United Nations).  Others argue more forcefully that the logic of basic human rights establishes correlative duties to respect, protect, and defend.  In Basic Rights (Princeton, 1980), Henry Shue famously argues that a basic right such as the right not to be killed arbitrarily entails not only duties not to kill, but duties to protect or enforce the right: a negative right such as the right not to be killed arbitrarily requires positive action by others (as do positive rights to subsistence).  Furthermore, Shue argues later, if and when the primary holder of the correlative duties (that is, the state) fails to meet its obligations, then the duty to protect and defend human rights defaults to others.  Thus, humanitarian interventions are justifiable as discharging a (default) duty to protect and defend basic human rights not being respected by the target state.  At least in its conclusion, the ICISS Report, The Responsibility to Protect, advances a similar view about interventions.  Another, related argument to support a duty to intervene derives from theories of global distributive justice.  For example, Allen Buchanan argues that a natural duty of justice obligates all to do what we can "to help create structures that provide all persons with access to just institutions, … where this means primarily institutions that protect human rights" (Justice, 85ff.).  Arguments appealing to rights and correlativity relations are not uncontroversial.  For those who distinguish positive and negative rights, for example, the correlative duty for a right to life is simply and only not to kill.  Thus, so long as a state or international organization is not the perpetrator of atrocities against its own people, it would seem that correlative obligations have been satisfied without coming close to establishing military intervention as a duty.  An issue in the background is what one takes to be the model for understanding human rights and the extent to which duties of respect and protection correlate with those rights and for whom or what.

4. Other Issues and Challenges

Just war theory has been the most prominent framework for philosophic discussion of the morality of humanitarian interventions.  Other relevant approaches include attention to international law and its ethical implications, and to an issue central to political philosophy, the concept of state sovereignty.  Among the most powerful and prominent objections to interventions are those based on state sovereignty and on what is called "the selectivity problem."  Some alternative frameworks for considering humanitarian interventions are actually challenges to just war theory itself.  Political realisms deny the applicability of moral norms to state behavior, including uses of military force.  Pacifism typically denies the premise of just war theory, namely, that some wars are morally justifiable, even if waged for humanitarian purposes.

a. International Law and Ethics

Much discussion of humanitarian interventions involves legal issues under the Charter of the United Nations, the central and paramount text of the international law of force.  Philosophers of law have accorded relatively little attention to international law.  Questions about the legality of interventions, however, exhibit significant philosophic and ethical dimensions, even setting aside here many matters of analytic jurisprudence, such as whether international law constitutes a genuine legal system, or whether there is such a thing as international law at all (See IEP article, "Philosophy of Law").  Attending to the international law of force and human rights involves issues of interpretation, sources of law, the ethics of acting illegally and of reform, as well as the extent to which states or people ought to be at the center of the system.

At the center of the international law about interventions are explicit provisions of the United Nations Charter and human rights treaties.  Proclaiming "the sovereign equality of all states," the Charter permits states to use armed force only in self-defense, prohibits states' "threat or use of force against the territorial integrity or political independence of any state," prohibits intervening "in matters which are essentially within the domestic jurisdiction of any state," and allows the Security Council to authorize uses of armed force only if domestic strife or brutalities also constitute "threats to international peace and security" (Articles 2.1, 51, 2.4, 2.7, and 39, respectively).  The text seems unequivocally clear:  unauthorized humanitarian interventions are illegal.  And since humanitarian emergencies typically do not threaten international peace and security, the text permits authorizing few, if any, interventions.  Furthermore, the nine core human rights treaties and the 1948 Universal Declaration of Human Rights explicitly require only that each state respect, protect, and enforce the provisions listed, such as rights to life.  The 1948 Genocide Convention requires that signatories "prevent and punish" the "crime of genocide," but the only explicitly permissible means is via "the competent organs of the United Nations."  The 1998 International Criminal Court statute (called the "Rome Statute") makes its authority largely dependent upon and only complementary to individual states' enforcement.  Human rights, for example, may be transnational norms, but international law makes the respect, defense, and protection of those rights almost exclusively a domestic matter for each state.  So, to promote international peace and security, inter-state uses of armed force are severely limited by law, even when domestic violence against people may be widespread and systematic.

There are ethical dimensions to the system of international law as it relates to interventions. The United Nations Charter has been accepted by the consent of each and every member of the United Nations – virtually every state on the planet.  State consent is among the established procedures for creating international law, and consent creates compliance obligations for states.  Legal positivists hold that there are (ethical) obligations to obey laws enacted according to established procedures (See IEP article, "Legal Positivism").  So, for positivists, it follows that states and international organizations ought to obey the law and therefore ought not conduct unauthorized humanitarian interventions.  Though each state ought to respect, protect, and enforce human rights, relevant international law texts do not provide a legal basis for unauthorized interventions or for authorizing interventions, even as a way to stop a state's violating its own treaty obligations by mistreating its own people.  Legal positivists maintain that there are legal and moral obligations not to interfere militarily with the domestic affairs of states, even in the face of a humanitarian emergency.

Challenges to this line of reasoning take several forms.  First, some legal scholars quite carefully parse the specific Charter texts in ways consistent with humanitarian interventions being permitted.  For example, Article 2(4) does not prohibit all uses of military force, but only those aimed at the independence or territory of another state.  Legally permissible, then, would be any humanitarian interventions having neither those aims nor those effects.  Disagreements about interpretation raise philosophic issues about how best or properly to interpret legal texts.  Some dispute such textual parsing as ignoring the original intent of the language.  Others deny that original intent is probative, granting a more significant role for contemporary attitudes, beliefs, and norms about interventions, or appealing to a political morality implicit in legal texts and their interpretive history.  A second area of disagreement about legalities attends to considerations of non-textual sources of law, or what is called “customary international law.”  Analogous to common law in domestic legal systems, general state practice accepted as law is evidence of a rule of customary international law (Article 38(I), Statute of the International Court of Justice).  Some argue that long-standing state practice has established a customary right of humanitarian intervention; others deny this claim of fact about state practice, or assert that the written law of the Charter supersedes any putative customary rule.  In effect, there is much controversy about what H. L. A. Hart famously has called “a rule of recognition” for the system of international law, especially customary law.  One final issue deals with the ethics of acting illegally.  At the heart of creating customary law about interventions is establishing a state practice of intervening.  This requires that states begin creating a custom by acting in ways neither required nor permitted by international law at the time: legality is created over time only by a process initially requiring illegal actions.  Given sufficient moral grounds for reforms to permit humanitarian interventions, then, a moral argument can be made for illegally intervening now to address emergencies and thereby contribute to reforming international law.  Unauthorized humanitarian interventions then can be seen as a kind of international civil disobedience by states or international organizations.

b. State Sovereignty and Intervention

State sovereignty is a major issue for humanitarian interventions, whether as a source of opposition or of significant challenge for proponents.  For centuries the general idea has been that a sovereign state has supreme authority over its territory, its people, and its relations with other states; and so, other states or organizations are not to interfere with exercises of that supreme authority.  Matters of sovereignty have been central to political philosophy, international relations, international law, and the institutions and practices constitutive of the modern world order.  Prima facie, humanitarian interventions challenge state sovereignty and the international system of non-interference in states' domestic authority.  The literature is vast, the issues complex, the notion of sovereignty contentious and controversial to the core.

The 16th century French thinker, Jean Bodin, in his Six Books of the Commonwealth (1576), is credited with coining the term ‘sovereignty’ to denote a state’s supremacy of authority within a territory and population.  Subsequent political philosophers, like Locke, Hobbes, Rousseau, and the utilitarians, have focused much on the source, locus, and limits of sovereignty within a state, while merely acknowledging an accompanying externally directed authority to make war, peace, alliances, and treaties with other powers.  The idea of sovereignty as independence from others’ interference is tied originally to the 1648 Treaty (or Peace) of Westphalia and develops into a strong principle of non-interference during the 19th century.  Simply stated, then, state sovereignty involves supremacy and independence of authority with respect to internal matters and with respect to relationships with other powers, including the absence of non-consensual interference by other sovereign states or other organizations.  The “Westphalian system” is an order of mutually independent states excluded from interfering in one another’s domestic affairs.

Whether authority is seen as effective control or as a right, the merits of sovereignty as independence are mixed.  For example, state sovereignty can express and protect a people's collective right of self-determination in matters political, social, and cultural.  A plurality of independent sovereign states accommodates appropriate diversity among the peoples of the earth; a system of non-interference promotes international stability and order.  The sovereignty of states is sometimes portrayed as akin to that of individual persons, coupling autonomy rights over their own lives, independence from external control, and mutual, reciprocal duties not to interfere with others.  Also, state sovereignty is long embedded in international law.  As mentioned above, the United Nations Charter, for example, asserts the equality, independence, and freedom from external interventions in states' domestic affairs.  In contrast, it is argued, taken too far, sovereignty precludes any international law at all, since supremacy and independence are reduced by any transnational legal rules limiting war or breach of treaties, for example.  Similar reasoning leads to concerns that sovereignty precludes appeals to transnational moral norms, such as, for example, natural law duties or universal human rights.  Some argue that state sovereignty is not limited by, but literally constituted by, international law:  there is no sovereignty outside the legal system that constructs it and, thus, the contours of state authority change as the content of international law changes.  For example, current international law prohibiting torture, genocide, or disregard for basic human rights effectively redefines the scope of authority accorded states: such acts are not expressions of sovereignty, but abuses.

An often neglected line of argument shows that states themselves sometimes express their sovereignty in order to limit the scope of their own sovereignty – what S.I. Benn long ago called "auto-limitation."  Robert Keohane makes the same point:  "…[S]overeignty is quite consistent with specific restraints.  Indeed, a key attribute of sovereignty is the ability to enter into international agreements that constrain a state's legal freedom of action" (in Holzgrefe, 283-284).  On almost any account, state sovereignty includes the right to enter into treaties, just as personal autonomy rights can be expressed by making promises or signing contracts that obligate, bind, and limit future actions.  If states have freely chosen to sign human rights treaties, for example, or the Genocide Convention, or the United Nations Charter, then, through that expression of their sovereign authority, they have limited what is within their authority later, for example, committing genocide, waging aggressive war, or disregarding the outcomes of established procedures or processes.  States involved in humanitarian emergencies are then abusing, not exercising, the sovereign authority they chose to limit.  Though more controversial and problematic, such "auto-limitation" may also apply to outcomes of procedures yet to be implemented.  So, for example, since provisions of the Charter allow for humanitarian interventions by Security Council authorization, any member state's sovereignty is not violated by duly authorized outsiders using armed force to rescue, defend, or protect that state's people from abuse.

Discussions of humanitarian intervention have led to alternative ways of thinking about state sovereignty.  One line of thinking makes the sovereignty of a state conditional and contingent (Holder, 89-96).  A state has genuine sovereignty only if it meets minimal moral requirements, such as exercising effective control in maintaining order and security, avoiding egregious mistreatment of its people, or, less minimally, reflecting the political will of the people themselves.  So, failed or grossly abusive states, for example, have no sovereignty and, thus, an otherwise justified humanitarian intervention does not violate the target state's sovereignty. Second, Robert Keohane has proposed that sovereignty rights need to be seen as separable, so that a state, based on certain criteria, retains some kinds of authority while losing others.  Sovereignty rights can be "unbundled" and they admit of gradations.  So, exclusion of external control over territory may be an authority lost by a state, but that same state may continue to have some limited domestic authority at the same time.  Third is a proposal to see sovereignty as states' responsibility, a kind of duty to protect all people's human rights.  As described in The Responsibility to Protect, states failing to protect their own citizens' rights temporarily forfeit sovereignty rights, others' duties of non-interference are suspended, and then other states or organizations assume the responsibility to protect persons by intervening, perhaps even militarily.  That sovereignty includes domestic duties to respect and protect peoples' rights is a feature of classic social contract theories of state authority, such as those of Locke, Kant, or even Hobbes.  The proposal, though, controversially maintains that each state's sovereignty includes a responsibility to protect not only the rights of its citizens, but the rights of aliens in other lands, a responsibility of "saving strangers."  These alternative approaches all depart from the letter of international law about qualifying for state sovereignty.  They are an extension of a greater emphasis on human rights as transnational moral norms.  Alternative, normative understandings of the modern state show that, under certain conditions, humanitarian interventions are not violations of sovereignty at all.

c. The Problem of Selectivity

Among the most common objections to humanitarian interventions is “the vexed issue of selectivity.”  The concern is that states or international organizations choose to intervene militarily in only some humanitarian emergencies: only some sufferings are selected for forceful action by outsiders.  Some critics, such as Noam Chomsky (The New Military Humanism, 1999), see selectivity as undermining any and all moral merit to military interventions to protect basic human rights.  If humanitarianism is the issue, why intervene here and not there?  How does it come about that armed interventions take place in one crisis but not in another?  How can it be morally acceptable that, though there are many emergencies warranting others’ forceful response, only some situations are selected for armed interventions and only some people’s basic human rights are defended by others’ military force?

The objection is sometimes seen in terms of ethical consistency:  among all those situations where a humanitarian intervention is morally justifiable, in only some of those cases is an intervention conducted.  It would appear that like cases are not being treated alike.  The implicit appeal is to the universalizability of genuine moral judgments. But a lack of clarity or precision often cloaks the objection.  Sufficient sufferings by people – threshold conditions – are only one feature of similarity between cases, and appeals to similarities of sufferings do not alone show that intervening in the compared cases is morally justified.  There are other necessary conditions to justifying an intervention (for example, likelihood of success) and sometimes those are not met in the cases being compared.  The selectivity problem arises only when one or more situations satisfy all the requirements for justifying uses of armed force.  Second, interventions being morally justified is not inconsistent with only some interventions taking place.  If being morally justified means there is a right to intervene, then, as with most rights, the right-holder can choose whether to exercise the right or not, whether to actually intervene or not.  If being morally justified means there is an imperfect duty to rescue, then, as an imperfect duty, the obligated parties can choose when and where to discharge the duty.  Third, understood as ethical inconsistency, selectivity seems hardly sufficient reason to reject all humanitarian interventions as unjustified.  The moral flaw of inconsistency does not require doing nothing: because one cannot or does not do everything morally justified on similar grounds, it does not follow that one ought never do what is morally right.

A second version of the objection points to the substantive criteria by which interventions are selectively conducted, and there is something right about this form of the objection.  It is problematic if, among an array of justifiable interventions, states select only some situations for intervening based on morally suspect criteria, such as regional bias or media attention (what is called "the CNN effect").  For example, in the 1990s military force was employed in the Balkans for humanitarian purposes, but not in Rwanda.  Assuming both situations warranted intervention, the issue, then, is not only ethical inconsistency, but suspicions about the ethical acceptability of the substantive criteria for selective action.  So, for example, if there is a moral right to intervene, is it not morally problematic if the right-holder chooses whether to intervene based on the race or region of those people suffering? If there is even an imperfect duty to intervene, is it not morally problematic to select those to be rescued based on whether they are European or Christian, or based on the extent and kind of media coverage provided?  This version of the selectivity problem has merit in calling for diligence, discipline, and care in choosing how to exercise rights or discharge duties.  It is not clear that this version of the selectivity problem is sufficient reason to oppose all humanitarian interventions, unless the reliance on morally suspect criteria is pervasive or even unavoidable.  And that leads to a third, and the strongest, version of the selectivity problem for humanitarian interventions.

The claim is that states selectively intervene based on national self-interest, not based on humanitarian need or warrant.  Though seldom distinguished by critics of interventions, a weaker version of the objection is that, among morally justified interventions, states choose to intervene in those situations that serve their national interest.  A kind of national prudence supervenes on the array of morally permissible interventions.  It is not obvious that this is problematic for states any more than it is for individuals who invoke principles of prudence to choose among morally permissible possibilities.  And it does not seem sufficient to reject humanitarian interventions as unjustified, even if, in fact, all states do combine moral and prudential considerations in selecting sites for intervention.  A stronger version of the objection, however, reflects concerns about imperialistic ambitions or hegemonies of intervening parties.  One can see this objection as a skepticism about what genuinely drives states' decisions about whether to intervene or not.  The selectivity objection, then, is concerned not so much with moral flaws or inconsistency as with the inescapable role of national self-interest in deciding whether to intervene.  Seen this way, the objection reflects political realism as an alternative framework for considering the international arena, including states and humanitarian interventions.

d. Political Realism

Political realism takes many forms, none of which support independent ethical norms as relevant to international relations, including states’ uses of armed force or war (See IEP article, “Political Realism”).  Strong forms of descriptive realism maintain that all state action is in pursuit of national self-interest, typically understood in terms of national security, military or economic power, or material well-being.  If all states’ actions are, in fact, motivated by self-interest, then state actions motivated solely or primarily by humanitarian considerations are not possible or morally justifiable.  Such a strong form of descriptive political realism, however, is a dubious empirical generalization about international relations and about the scope and stability of what constitutes national interest.  There are examples of states’ cooperating, of states sometimes acting on moral grounds, of states sometimes acting contrary to their national interest, or of states changing what constitutes their national interest (Buchanan and Golove, 873-874).  Prescriptive political realism maintains that states should pursue their own national interests in the international arena: it advocates a norm of prudence, of pursuing self-interest, but not of morality, as properly governing state behavior.  According to realisms, then, uses of armed force in defense of others’ human rights sometimes occur because it is in the national interest of the interveners.  An example might be an intervention conducted in a bordering state, due to the national security interests threatened by having an inept or failed state as neighbor (for example, refugees and interruptions of oil or water supplies).  The justifications for intervening, if any, are not moral principles, but appeals to promoting the intervener’s national interests, which is, the realist maintains, the way states do or should act.

Political realists’ amoralism about states’ actions typically correlates with a model of the international arena as analogous to Hobbes’s state of nature (See IEP article, “Social Contract Theory” and “Thomas Hobbes: Moral and Political Philosophy”):

  • There is no global power, no supreme power to enforce cooperation and peace;
  • there is relative equality of power among states;
  • each state should pursue its self-interest, by any feasible means, including by anticipatory domination of other states when possible;
  • and there is no morality applicable amidst pervasive mutual assurance problems.

In contrast, supporters of just war theory, universal human rights, and morally justified humanitarian interventions typically see the international arena as more analogous to Locke's state of nature (See IEP article "John Locke: Political Philosophy").  There exist transnational moral norms (for example, of human rights, justice) that bind states and organizations in their relations to one another, including perhaps an international analogue to the Lockean executive right to punish and enforce those transnational norms, even by use of armed force.  Though the proposed content of the transnational moral principles may vary, the relevance and applicability of moral principles oppose realists' amoralism about international relations.  Political realisms and defenders of morally justified wars or humanitarian interventions reflect fundamentally different conceptions of world order.

e. Post-Colonialism and Feminism

Post-colonialism's attention to issues of power, representations in discourse, perspective, and history provides an alternative approach to issues of war and humanitarian interventions.  For example, the selectivity issue (IV.C. above) is seen as about abuse of power, and about the discourse of rights, law, and "just war" masking imperialistic ambitions or hegemonies of intervening parties.  Examples of abuse abound, it is argued, from the days of the Cold War to later incursions in the Middle East and Afghanistan (Gregory).  The moral universalism of human rights and of other concepts employed as intervention threshold conditions (II. above) is not neutral, given its emphasis on the individual, on negative civil rights, and on the rule of law.  The discourse of just war thinking looks at uses of military force from the perspective of those deciding whether to wage war, not from the perspective of those against whom the war is waged or those who are suffering.  Intervening parties, it is argued, are former colonial powers with lingering imperialist ambitions, and those to be protected are former subjects of these imperialist ambitions.  Given the asymmetries of power in the world, colonialism and imperialism continue in the way in which dominating powers structure and influence the lives of those around the world, so much so that there are nearly insurmountable obstacles to the subaltern speaking and being heard (Spivak).  Post-colonialist approaches call for skepticism, at least, about moral justifications for war or armed humanitarian interventions; and they call for involving diverse, alternative voices and thinking in response to human suffering.

Feminist thinking about humanitarian interventions includes challenges to the substance and implications of employing the "just war" framework.  The questions posed by this approach "risk androcentric or sexist bias" and commonly proposed rules about just interventions "remain gendered in concealed ways" (Cudd, 360, 363).  One challenge is to explore the ethics of care as an alternative approach (for example, Held).  More direct challenges to the just war framework consider whether threshold conditions, such as genocide or crimes against humanity, incorporate rape and sexual atrocities that victimize women in particular (for example, Card), or whether oppression of women satisfies "just cause" requirements for using military force (Cudd, 369-370).  Another concern is that proportionality requirements should include, among the effects of interventions, consequences for enhancing or diminishing "women's rights and power" and for the relational autonomy of individuals, as that concept has been developed in other feminist work in ethics (Cudd, 366).  The suggestions often include a call for more attention to non-military, preventive action to address human rights issues, including traditional gender roles and hierarchies.

f. Pacifism

A final consideration is another source of challenge to humanitarian interventions: pacifism. Just war theory's attempt to delineate some wars as morally justified stands between political realism's denial that morality applies to state behavior and pacifism's rejection of all war, killing, or violence by states.  Among the many varieties of pacifism, the most relevant to questions about humanitarian interventions are absolute anti-war pacifism and, in this context, what is often called "just war pacifism."  Using typical just war requirements – jus ad bellum and jus in bello – it is argued that no war has satisfied or even can satisfy all jus bellum standards, including, it would follow, wars fought for humanitarian purposes.  In effect, just war pacifism opposes all wars by applying rigorously and strictly all the standard requirements for a war to be justified.

Arguments for just war pacifism typically focus on a few jus bellum requirements: proportionality considerations, in bello discrimination as providing immunity for non-combatants, and the idea of war only as a last resort.  Calling attention to the undeniable destructive consequences of war and use of military force, just war pacifists deny that the benefits do or can sufficiently outweigh the costs.  Proportionality requirements are interpreted and applied in ways such that they are not and cannot be satisfied, even by uses of military force for humanitarian purposes.  The argument depends on complex causal estimations and calculations about which certainty or reliability is dubious. Just war pacifism sees macro-proportionality as capable of much more justificatory work than it is accorded by many just war theorists.  The argument is more effectively employed with respect to micro-proportionality and the in bello discrimination principle.  Warring parties cannot avoid what is euphemistically called "collateral damage" – the death of non-combatants and the destruction of non-military property – despite the features of contemporary warfare, with its "smart bombs," drones, and technological targeting controls.  Just war pacifism rightly attends to this feature.  The pacifists argue that even with modern technology, levels of collateral damage remain too high to be morally justifiable.  Morally acceptable standards are not and cannot be satisfied; thus, even if all ad bellum standards are met, no war is a just war.  The difficulty with the argument is establishing precise levels of morally acceptable death and destruction for non-combatants, whether seen as unintended consequences or not.  Of course, if no "collateral damage" is morally permissible, then it would seem that no war, no humanitarian intervention, could be a truly just war.  Finally, just war pacifism demands that war be a last resort and argues that there always are or can be non-military alternatives.  These arguments typically turn on how to construe the last resort requirement.  As mentioned above, a literal reading of the idea rules out most wars or interventions as unjust; and an expansive, counterfactual construal of the requirement makes no wars just, but tends to conflate advocacy for better preventive infrastructure and strategies with justifying responses to developing events.  Just war pacifism, like any absolute, unconditional opposition to war and use of military force, must somehow negotiate a troubling moral path whereby innocent persons will not be rescued because of a superior principle prohibiting the use of armed force, even for humanitarian purposes to stop widespread, systematic human suffering.

5. References and Further Reading

  • Bass, Gary J. Freedom’s Battle: The Origins of Humanitarian Intervention. New York: Random House, 2008.
    • An easily readable rendition of modern cases of interventions in order to show that “all of the major themes of today’s heated debates about humanitarian intervention … were voiced throughout the nineteenth century.”
  • Buchanan, Allen.  Justice, Legitimacy, and Self-Determination: Moral Foundations for International Law.  Oxford: Oxford University Press, 2004.
    • A significant contribution to a number of issues and discussions, albeit challenging in its sophistication and conclusions rooted in a Kantian approach to moral theory.
  • Buchanan, Allen, and David Golove.  “Philosophy of International Law,” in The Oxford Handbook of Jurisprudence & Philosophy of Law. Ed. Jules Coleman and Scott Shapiro.  Oxford: Oxford University Press, 2002. 868-934.
    • A defense and description of normative philosophy of international law, including attention to political realism, legal positivism, transnational distributive justice,  human rights, secession, and humanitarian intervention (but not including just war theory).
  • Card, Claudia.  “The Paradox of Genocidal Rape Aimed at Forced Pregnancy.” The Southern Journal of Philosophy 46 (2008): 176-189.
  • Chatterjee, Deen K., and Don E. Scheid, eds.  Ethics and Foreign Intervention. Cambridge: Cambridge University Press, 2003.
    • A collection of contributions to conceptual and normative issues of humanitarian intervention, the merits and limits of the "just war" approach, law and secession, and critiques of interventionism.  Especially recommended are the essays by Hoffmann, Brown, Lucas, and Coady.
  • Cudd, Ann E.  “Truly humanitarian intervention: considering just causes and methods in a feminist cosmopolitan frame.”  Journal of Global Ethics 9 (2013): 359-375.
  • Fletcher, George P., and Jens D. Ohlin.  Defending Humanity:  When Force is Justified and Why.  New York: Oxford University Press, 2008.
  • Govier, Trudy. “War’s Aftermath: The Challenge of Reconciliation.” War: Essays in Political Philosophy. Ed. Larry May.  Cambridge: Cambridge University Press, 2008. 229-248.
  • Gregory, Derek. The Colonial Present: Afghanistan, Palestine, Iraq. Wiley-Blackwell, 2004.
  • Held, Virginia.  “Military Intervention and the Ethics of Care.” The Southern Journal of Philosophy 46 (2008): 1-20.
  • Hoffmann, Stanley.  The Ethics and Politics of Humanitarian Intervention.  Notre Dame: University of Notre Dame Press, 1997.
  • Holder, Cindy. “Responding to Humanitarian Crises.”  War: Essays in Political Philosophy. Ed. Larry May.  Cambridge: Cambridge University Press, 2008. 85-104.
  • Holzgrefe, J. L., and Robert O. Keohane, eds. Humanitarian Intervention: Ethical, Legal, and Political Dilemmas.  Cambridge: Cambridge University Press, 2003.
    • A collection of contributions, including an excellent survey of philosophic issues in the humanitarian intervention debate by Holzgrefe.  Other contributors address issues of international law, global ethics, and state sovereignty.
  • International Commission on Intervention and State Sovereignty (ICISS).  The Responsibility to Protect: Report of the International Commission, and Supplementary Volume to the Report.  Ottawa: International Development  Research Centre, 2001.
    • The Report is a pithy summary of major issues and a defense of interventions in terms of "just war" principles, with attention to institutional and legal matters related to the UN. The supplementary volume includes experts' background essays on history and major issues (for example, "State Sovereignty," "Prevention"), presentations of numerous cases, and extensive bibliographies organized by facets of the debates and controversies about humanitarian interventions.
  • Johnson, James Turner.  Morality and Contemporary Warfare. New Haven: Yale University Press, 1999.
    • A historically informed approach to the just war tradition and the theory’s suitability for today’s world.  Chapter 3 is devoted to “the question of intervention.”
  • Jokic, Alexander, ed.  Humanitarian Intervention: Moral and Philosophical Issues.  Toronto: Broadview Press, 2003.
    • A collection of conference papers; especially recommended are contributions by Ellis, Wilkins, Pogge, and Buchanan.
  • Lang, Anthony F., ed.  Just Intervention.  Washington, D.C.: Georgetown University Press, 2003.
    • Especially relevant are contributions by Nardin, Chesterman, Weiss, and Cook.
  • Lee, Steven P. Ethics and War: An Introduction.  Cambridge: Cambridge University Press, 2012.
    • A comprehensive, sophisticated introduction to “just war” theory which includes advocating a “human rights paradigm” to address interventions and questions of state sovereignty.
  • Lucas, Jr., George R.  Perspectives on Humanitarian Military Intervention.  Berkeley: University of California Press, 2001.
  • Nardin, Terry, and Melissa S. Williams, eds.  Humanitarian Intervention.  NOMOS XLVII. New York: New York University Press, 2006.
  • Orend, Brian.  The Morality of War. Second edition.  Toronto: Broadview Press, 2013.
    • Written by one of the major contributors to contemporary just war theory, including extensive attention to jus post bellum issues.
  • Orford, Anne.  Reading Humanitarian Intervention.  Cambridge: Cambridge University Press, 2003.
  • Rawls, John.  The Law of Peoples. Cambridge: Harvard University Press, 1999.
    • This work is prominent in discussions of a host of issues in global ethics and international law.  An extension of his landmark social contract argument in A Theory of Justice (1971), in the context of a contractarian theory of international society and law, this work briefly addresses human rights, just wars, and interventions.
  • Smith, Michael.  “Humanitarian Intervention: An Overview of the Ethical Issues.”  Ethics and International Affairs 12 (1998): 63-79.
  • Spivak, Gayatri Chakravorty. "Can the Subaltern Speak?" Marxism and the Interpretation of Culture. Ed. C. Nelson and L. Grossberg.  University of Illinois Press, 1988. 271-313.
  • Teson, Fernando R. Humanitarian Intervention: An Inquiry into Law and Morality. Third edition. Ardsley, NY: Transnational Publishers, 2005.
    • Written by an international law professor, this volume develops a philosophic and legal defense of interventions from a decidedly liberal, Kantian perspective.
  • Walzer, Michael.  Just and Unjust Wars: A Moral Argument with Historical Examples.  Third edition.  New York: Basic Books, 1977, 2000.
    • Now in a fourth edition, this volume has become the classic, early 21st century discussion of “Just War” theory in its entirety. Chapter 6 is devoted to interventions and the Preface to the Third Edition succinctly outlines major issues for morally justifying humanitarian interventions.
  • Walzer, Michael.  Arguing about War. New Haven: Yale University Press, 2004.
    • A collection of essays addressing developing issues in just war theory, including humanitarian interventions (see especially selection 5, “The Politics of Rescue”).
  • Wheeler, Nicholas J. Saving Strangers: Humanitarian Intervention in International Society.  Oxford: Oxford University Press, 2000.
    • Discussions of the most prominent cases of the last half century, to explore "how different theories of international society lead to different conceptions of the legitimacy of humanitarian intervention."

 

Author Information

Robert Hoag
Email: Bob_Hoag@berea.edu
Berea College
U. S. A.

Epistemic Consequentialism

Consequentialism is the view that, in some sense, rightness is to be understood in terms of conduciveness to goodness. Much of the philosophical discussion concerning consequentialism has focused on moral rightness or obligation or normativity. But there is plausibly also epistemic rightness, epistemic obligation, and epistemic normativity. Epistemic rightness is often denoted with talk of justification, rationality, or by merely indicating what should be believed. For example, my belief that I have hands is justified, while my belief that I will win the lottery is not; Alice’s total belief state is rational, while Lucy’s is not; we all should be at least as confident in p or q as we are in p. The epistemic consequentialist claims, roughly, that these kinds of facts about epistemic rightness depend solely on facts about the goodness of the consequences. In slogan form, such a view holds that the epistemic good is prior to the epistemic right.

Many epistemologists seem to have sympathy for the basic idea behind epistemic consequentialism, because many epistemologists have been attracted to the idea that epistemic norms that describe appropriate belief-forming behavior ultimately earn their keep by providing us with some means to garner what is often thought to be the epistemic good of accurate beliefs. Consequentialist thinking has also gained popularity among more formally minded epistemologists, who apply the tools of decision theory to argue in consequentialist fashion for various epistemic norms. And there is also a consequentialist strand in certain areas of philosophy of science, especially those areas that attempt to explain how it is that science as a whole might have considerable epistemic success even if individual scientists are acting irrationally. Thus, there is a kind of prima facie plausibility to epistemic consequentialism.

Table of Contents

  1. Consequentialism
  2. Final Value and Veritism
  3. Consequentialist Theories
    1. A Simple Example
    2. Cognitive Decision Theory
    3. Accuracy First
    4. Traditional Epistemology: Justification
      1. Coherentism
      2. Reliabilism
      3. Evidentialism
    5. Traditional Epistemology Not Concerned with Justification
    6. Social Epistemology
    7. Philosophy of Science
      1. Group versus Individual Rationality
      2. Why Gather Evidence?
  4. Summing Up: Some Useful Distinctions
  5. Objections to Epistemic Consequentialism
    1. Epistemic Trade-Offs
    2. Positive Epistemic Duties
    3. Lottery Beliefs
  6. References and Further Reading

1. Consequentialism

There is unfortunately no consensus about what precisely makes a theory a consequentialist theory. Sometimes it is said that the consequentialist understands the right in terms of the good. Somewhat more generally, but still imprecisely, we could say that the consequentialist maintains that normative facts about Xs (for example, facts about the rightness of actions) depend solely on facts about the value of the consequences of Xs. In light of this, some see consequentialism as a reductive thesis: it purports to reduce normative facts (for instance, about what one ought to do) to evaluative facts of a certain sort (for instance, about what is good). Smith (2009) and others, however, mark what is distinctive about consequentialism differently. Some maintain that a consequentialist is committed to understanding what is right or obligatory in terms of what will maximize value (Smart and Williams 1973, Pettit 2000, Portmore 2007). Still others maintain that a consequentialist is one who is committed to only agent-neutral, rather than agent-relative, prescriptions (where an example of an agent-relative prescription is one that instructs each person S to ensure that S not lie, whereas an agent-neutral prescription instructs each person S to minimize lying) (McNaughton and Rawling 1991). And finally, some maintain that what is distinctive about consequentialism is the lack of intrinsic constraints on action types (Nozick 1974, Nagel 1986, Kagan 1997).

Perhaps the best way to elucidate consequentialism, then, is to point to paradigm cases of consequentialist theories and attempt to generalize from them. On this score there is some agreement: classic hedonic utilitarianism (of the sort defended by Bentham and Mill) is thought to be a clear instance of a consequentialist theory. That theory maintains that an action is morally right if and only if the total sum of pleasure minus pain that results from that action exceeds the total sum of pleasure minus pain of any alternative to that action. The normative facts here are facts about the moral rightness of actions and the utilitarian claims that these facts depend solely on facts about the moral goodness of the consequences of actions, where moral goodness is measured by summing up total pleasure minus total pain.

Though it is not possible to give an uncontroversial set of necessary and sufficient conditions for a theory being a species of consequentialism, it is useful to see that there is some sort of unity to views, such as hedonic utilitarianism, normally classified as consequentialist. The following three-step “recipe” for a consequentialist theory evinces this unity, and will be useful to refer to later. (A similar recipe is given by Berker 2013a,b.)

Step 1. Final Value: identify what has final value, where something has final value iff it is valuable for its own sake (sometimes the term “intrinsic value” is used in the same way).

Example: For the classic hedonic utilitarian, pleasure is the sole thing of final value and pain is the sole thing of final disvalue; thus, final value here generalizes the concept of moral goodness invoked above.

Step 2. Ranking: explain how certain things relevant to the normative facts you care about are ranked in virtue of their conduciveness to things with final value.

Example: The normative facts of interest to the classic hedonic utilitarian are facts about the rightness and wrongness of actions, so actions are the relevant things to rank. The classic hedonic utilitarian says that actions can be ordered by calculating for each action the sum of the total final value in the consequences of that action.

Step 3. Normative Facts: explain how the normative facts are determined by facts about the rankings.

Example: The classic hedonic utilitarian says that an action a is right if and only if it is ranked at least as high as any action that is an alternative to a.
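To make the structure of the recipe concrete, here is a minimal sketch in Python of the hedonic utilitarian version just described; the actions and their hedonic payoffs are invented purely for illustration.

# A toy illustration of the three-step consequentialist recipe, using classic
# hedonic utilitarianism. Actions and payoffs are invented for illustration.

# Step 1. Final value: total pleasure minus total pain in an action's consequences.
actions = {
    "keep_promise": {"pleasure": 10, "pain": 2},
    "break_promise": {"pleasure": 7, "pain": 1},
    "do_nothing": {"pleasure": 3, "pain": 0},
}

def final_value(consequences):
    return consequences["pleasure"] - consequences["pain"]

# Step 2. Ranking: order actions by the final value of their consequences.
ranking = sorted(actions, key=lambda a: final_value(actions[a]), reverse=True)

# Step 3. Normative facts: an action is right iff it is ranked at least as high
# as every alternative, that is, iff it maximizes final value.
best = final_value(actions[ranking[0]])
right_actions = [a for a in actions if final_value(actions[a]) == best]

print(ranking)        # ['keep_promise', 'break_promise', 'do_nothing']
print(right_actions)  # ['keep_promise']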

2. Final Value and Veritism

Before looking at specific consequentialist epistemic theories, it is worth saying something about what epistemic consequentialists typically think about the first step in the recipe, which concerns final value. Many who are sympathetic to epistemic consequentialism also adhere to veritism (the term is due to Goldman 1999; Pritchard 2010 calls this view epistemic value t-monism). According to veritism, the only thing of final epistemic value is true belief and the only thing of final epistemic disvalue is false belief. Generalizing somewhat so that the view can capture approaches that think of belief as graded, we can say that according to veritism, the only thing of final epistemic value is accuracy and the only thing of final epistemic disvalue is inaccuracy. Not all epistemic consequentialists are veritists; others have thought that there is more to final epistemic value than mere accuracy, such as the informativeness or interestingness of the propositions believed, or whether the propositions believed are mutually explanatory or coherent. Others have thought that things such as wisdom (Whitcomb 2007), understanding (Kvanvig 2003), or a love of truth (Zagzebski 2003) have final epistemic value.

But even those consequentialists who think that accuracy does not exhaust what is epistemically valuable tend to think that accuracy is an important component of final epistemic value (for an alternative view, see Stich 1993). It is not hard to see why such a view is theoretically attractive. Although all explanations must come to an end somewhere, it seems that veritism, or at least something like it, is in a good position to give satisfying explanations of our epistemic norms. Veritism together with consequentialism can do so by showing how conforming to that norm conduces toward the goal of accuracy. If one could show, say, that by respecting one’s evidence one is likely to hold accurate beliefs, then one has a better explanation for an evidence-respecting norm than does the person who says such a norm is simply a brute epistemic fact.

Questions about final epistemic value are important for would-be epistemic consequentialists. This article notes the different views that epistemic consequentialists have held concerning final epistemic value, but there is little substantive discussion about the advantages and disadvantages of competing views about final epistemic value. That said, the debate concerning the nature of final epistemic value is an important debate for epistemic consequentialists to watch. In particular, the epistemic consequentialist will need a notion of final epistemic value according to which final epistemic value is the sort of thing that it makes sense to promote.

3. Consequentialist Theories

In light of the consequentialist recipe above, a specific epistemic consequentialist theory can be obtained by specifying the bearers of final epistemic value, the principle by which options are then ranked in terms of final epistemic value, and the normative facts that this ranking determines. Below, specific epistemic consequentialist theories are presented in this way.

a. A Simple Example

For illustrative purposes, consider a very simple consequentialist theory. According to this view, the only thing of final epistemic value is true belief. Then, say that a belief is justified to the extent that it garners epistemic value for the believer. This can be put in the consequentialist recipe as follows:

Step 1. Final Value: True beliefs have final epistemic value; false beliefs have final epistemic disvalue.

Step 2. Ranking: The normative facts at issue are facts about whether beliefs are justified, so beliefs are the natural thing to rank. According to this view, S’s belief that p is ranked above S’s belief that q iff the belief that p in itself and in its causal consequences garners more epistemic value for S than the belief that q.

Step 3. Normative Facts: The belief that p is justified iff it is ranked above every alternative to believing p.

One might think that this simple view has a relatively obvious flaw. It seems to imply that every true belief is justified and every false belief unjustified. This is what Maitzen (1995) argues:

If one seeks, above all else, to maximize the number of true (and minimize the number of false) beliefs in one’s (presumably large) stock of beliefs, then adding one more true belief surely counts as serving that goal, while adding a false belief surely counts as disserving it. (p. 870)

As clear as this seems, it is actually mistaken. For although the belief that p (when p is false) will not directly add value to S’s belief state, such a false belief may have an effect on other beliefs that S forms later and so, in total, be preferable to adopting the true belief that ~p. That said, no one has defended such a simple version of epistemic consequentialism. In actual practice, the relationship between final epistemic value and epistemic justifiedness is not proposed to be as direct as this simple view would have it. With that, we turn to examine such views.
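The reply can be made concrete with a schematic toy model; the scoring and the numbers are invented purely for illustration, and the point is only that an option's total epistemic value includes the value of the further beliefs it causally produces.

# A toy model of the reply to Maitzen: an option is scored by the value of the
# belief itself plus the value of the further beliefs it causally leads to.
# All numbers are invented for illustration.

VALUE_TRUE, VALUE_FALSE = 1, -1

def total_value(belief_is_true, downstream_truth_values):
    own = VALUE_TRUE if belief_is_true else VALUE_FALSE
    downstream = sum(VALUE_TRUE if t else VALUE_FALSE for t in downstream_truth_values)
    return own + downstream

# Believing ~p (true) leads to no further beliefs; believing p (false) happens,
# in this contrived case, to trigger three further true beliefs.
value_of_true_belief = total_value(True, [])                     # 1
value_of_false_belief = total_value(False, [True, True, True])   # -1 + 3 = 2

print(value_of_false_belief > value_of_true_belief)  # True: the false belief wins overall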

b. Cognitive Decision Theory

Suppose that we think that rational agents have degrees of belief that can be represented by probability functions, but we think there are still important all-or-nothing epistemic options that these agents have regarding which propositions they accept as true. Patrick Maher (1993), for instance, argues that even if we think of scientists as having degrees of belief, we still need a theory of acceptance if we want to understand science. Why is this? Maher defines accepting that p as sincerely asserting that p (this is not the only definition of acceptance; van Fraassen (1980), though he is writing primarily about subjective probability, thinks of acceptance as a kind of cognitive commitment; Harman (1986, p. 47) sees acceptance as the same as belief and says that one accepts p when (1) one allows oneself to use p as part of one’s starting point for further reasoning and when (2) one takes the issue whether p to be closed in the sense that one is no longer investigating that issue). Further, Maher maintains that the scientific record tells us about which theories scientists asserted not about what credences scientists had. Thus, a theory of acceptance (in the sense of sincere assertion) is needed to understand science on Maher’s view.

If we think of things roughly in this way, then it is natural to turn to decision theory to determine what propositions agents should accept. Decision theory tells an agent which action it would be rational to perform based on a ranking of each action available to the agent in terms of the action’s expected value. To find the expected value of an action for an agent, one considers each set of consequences the agent thinks is possible given the performance of that action, and then sums up the value of those consequences, weighted by the agent’s degrees of belief that those consequences are realized conditional on that action. An action is then taken to be rational iff no other action is ranked higher than it in terms of expected value. When considering which proposition it would be rational for an agent to accept, it is natural to set things up similarly. Instead of evaluating the usual type of actions, one evaluates acts of acceptance of propositions that are available to the agent. These different acts of acceptance can be ranked in terms of the expected final epistemic value of each act of acceptance.

Such an approach to acceptance is briefly discussed by Hempel (1960). Isaac Levi (1967) presents a more complete theory of this kind. Levi imagines that a scientist has a set of mutually exclusive and jointly exhaustive hypotheses h1, h2, …, hn and that the scientist's options for acts of acceptance are to accept one of the hi or to accept a disjunction of some of them. We suppose that scientists have subjective probability functions, which reflect the evidence that they have gathered with respect to the hypotheses in question. Levi's basic proposal is that agents should accept some hypothesis (or disjunction of hypotheses) if doing so maximizes expected final epistemic value, where the weight for the expectation is provided by the subjective probability function (this is very similar to, though not identical to, the weighting in terms of degrees of belief mentioned above). What is final epistemic value for Levi (Levi uses the term "epistemic utility")? According to Levi, final epistemic value has two dimensions that correspond to what the goals of any disinterested researcher ought to be. The first dimension is truth. True answers are valued more than false answers. The second dimension is "relief from agnosticism." The idea here is that more-informative answers (for example, "X wins") are valued more than less-informative answers (for example, "X or Y wins"). These values pull in opposite directions. One can easily accept a true proposition if informativeness is ignored, as the disjunction "X wins or X does not win" is sure to be true. Similarly, one can easily accept an informative proposition if truth is ignored. Accordingly, Levi defines a family of functions that balance these two dimensions of value. He does not settle on one way of balancing, but instead considers as permissible the whole family of functions that balance these two dimensions of value in different ways.
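The following sketch is in the spirit of Levi's proposal, though it is not his exact formulation: an answer is scored by its truth plus a weighted informativeness term, informativeness is measured by the fraction of hypotheses the answer rules out, and the rational act of acceptance maximizes subjective expected epistemic utility. The hypotheses, probabilities, and weights are invented for illustration.

# A Levi-style (but simplified) expected epistemic utility calculation.
# Hypotheses, probabilities, and the weights lam are invented for illustration.

hypotheses = ["h1", "h2", "h3", "h4"]
prob = {"h1": 0.4, "h2": 0.3, "h3": 0.2, "h4": 0.1}   # subjective probabilities

# Candidate acts of acceptance: a single hypothesis, a disjunction, or suspension.
answers = {
    "accept h1": {"h1"},
    "accept h1-or-h2": {"h1", "h2"},
    "suspend (accept the whole disjunction)": set(hypotheses),
}

def content(answer):
    # Informativeness: the fraction of hypotheses the answer rules out.
    return 1 - len(answer) / len(hypotheses)

def expected_utility(answer, lam):
    # Utility at a world: 1 if the answer is true there, 0 if false, plus lam
    # times the answer's informativeness. Taking expectations with respect to
    # prob gives P(answer) + lam * content(answer).
    return sum(prob[h] for h in answer) + lam * content(answer)

for lam in (0.5, 1.0, 2.0):
    best = max(answers, key=lambda a: expected_utility(answers[a], lam))
    print(lam, best)
# With lam = 0.5 suspension wins, with lam = 1.0 the disjunction h1-or-h2 wins,
# and with lam = 2.0 accepting h1 alone wins: different balancings of truth and
# informativeness license different acts of acceptance.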

Several features of Levi’s approach are worth noting. First, note that on Levi’s view it can happen that the proposition a scientist should accept is not the one that the scientist sees as most probable, because final epistemic value is a function of both the truth/falsity of the proposition and its informativeness.

The second point worth noting brings us to an important distinction when considering epistemic consequentialism. Levi is interested in the expected final epistemic value of accepting some proposition h1, but where the value of the consequences of accepting h1 includes only the value of accepting h1 and not the causal consequences of this acceptance. That is, suppose an agent has the option of accepting h1 or accepting h2. Suppose that h1 is both more likely to be true and more informative than h2. So on any weighting, and on any final epistemic value function, accepting h1 will rank higher than h2 if we ignore the later causal consequences of these acts of acceptance. But suppose that accepting h2 is known to open up opportunities for garnering much more final epistemic value later (perhaps by allowing one to work on a research project only open to those who accept h2). Levi's theory says that the agent should accept h1, not h2. Thus, it is a form of consequentialism that ignores the causal consequences of the options being evaluated. What matters are not the causal consequences of accepting h1, but rather the expected final value of the acceptance of h1 itself, ignoring its later causal consequences.

One might argue that this feature of Levi's view is enough to make it not a form of consequentialism, because it is not faithful to the idea that the total set of causal consequences of an option (for example, an action or a belief or an act of acceptance) is relevant to the normative verdict concerning that option. Be that as it may, there is still a teleological structure to Levi's view: acts of acceptance inherit their normative properties in virtue of conducing to something with final epistemic value. It is just that "conducing" is construed noncausally, in this case as something more akin to instantiation (Berker (2013a,b) explicitly allows such views to count as instances of epistemic consequentialism or epistemic teleology—he uses both terms). For future reference, this article uses the term "restricted consequentialism" to refer to views that are teleological in the sense of Levi's view, but do not take the total set of causal consequences of an option to be relevant to its normative status. In section 5, this distinction is examined more carefully.

Cognitive decision theory fits into our consequentialist recipe as follows:

Step 1. Final Value: Accepting propositions that are true has final epistemic value, and accepting propositions that are informative has final epistemic value. The total final epistemic value of accepting a proposition is a function of both its truth and its informativeness, though the way that these values are balanced can permissibly differ from agent to agent.

Step 2. Ranking: The act of accepting some answer to a question is ranked according to its subjective expected final epistemic value.

Step 3. Normative Facts: One should accept answer a to question Q iff accepting a is ranked at least as high as every other alternative answer to Q.

For criticism of this approach, see Stalnaker (2002) and Percival (2002).

c. Accuracy First

Cognitive decision theory takes for granted that agents have a certain kind of doxastic state, represented by a probability function, and uses this to tell us about the norms for the different kind of doxastic state of acceptance. But suppose that one does not want to take for granted such an initial doxastic state. Does decision theory have anything to offer such an epistemic consequentialist?

James Joyce (1998) shows that the answer to this question is "yes" if we accept certain assumptions about final epistemic value that many find plausible. Joyce argues that degrees of belief—henceforth, credences—that are not probabilities are accuracy-dominated by credences that are probabilities. A credence function, c, is accuracy-dominated by another, c′, when in all possible worlds, the accuracy of c′ is at least as great as the accuracy of c, and in at least one world, the accuracy of c′ is greater than the accuracy of c (for an introduction to possible worlds, see IEP article Modal Metaphysics). Joyce uses this, plus some assumptions about final epistemic value to establish probabilism, the thesis that rational credences are probabilities.

As Pettigrew (2013c) has noted, the basic Joycean framework requires one to do three things. First, one defines a final epistemic value function (often called an “epistemic utility function”). Second, one selects a decision rule from decision theory. Finally, one proves a mathematical theorem of the sort that says only doxastic states with certain features are permissible given the decision rule and final epistemic value function. Let us consider each of these steps in turn.

The final epistemic value functions that are typically used are different in kind than the functions used in cognitive decision theory. Whereas the final epistemic value functions in cognitive decision theory tend to value both accuracy—that is, truth and falsity—and informativeness, the final epistemic value functions in the Joycean tradition value only accuracy (this is why the moniker "accuracy first" is appropriate). Accuracy can be understood in different ways. There are two main issues here: (1) what counts as perfect accuracy? (2) how does one measure how far away a doxastic state is from perfect accuracy? With respect to (1), Joyce (1998) takes a credence function to be perfectly accurate at a world when the credence function matches the truth-values of propositions in that world (that is, assigns 1s to the truths and 0s to the falsehoods). Many have followed him in this, although there are alternatives (for example, one could think that a credence function is perfectly accurate at a world if it matches the chances at that world rather than the truth-values at that world). With respect to (2), things get more complicated. The appropriate mathematical tool to use to calculate the distance a credence function is from perfect accuracy is a scoring rule, that is, a function that specifies an accuracy score for credence x in a proposition relative to two possibilities: the possibility that the proposition is true and the possibility that it is false. There are many constraints that can be placed on scoring rules, but one popular constraint is that the scoring rule be proper. A scoring rule is proper if and only if the expected accuracy score of a credence of x in a proposition q, where the expectation is weighted by probability function P, is maximized at x = P(q). Putting together a notion of perfect accuracy and a notion of distance to perfect accuracy yields a final epistemic value function that is sensitive solely to accuracy. One proper scoring rule that is often used as a measure of accuracy is the Brier score. Let vw(q) be a function that takes value 1 if proposition q is true at possible world w and that takes value 0 if proposition q is false at possible world w. Thus, vw(q) merely tells us whether proposition q is true or false at possible world w. In addition, let c(q) be the credence assigned to proposition q, and let F be the set of propositions to which our credence function assigns credences. Then the Brier score for that credence function at possible world w is:

Σq ∈ F [1 - (vw(q) - c(q))²]

This will give us an accuracy score for every credence function for any world we please. Suppose, for example, that we are considering two credence functions defined over only the proposition q and its negation:

c1(q) = 0.75                c2(q) = 0.8

c1(~q) = 0.25             c2(~q) = 0.3

There are two possible worlds to consider: the world where q is true and the world where it is false. In the world (call it “w1”) where q is true, the Brier score for each credence function is as follows:

c1: [1 - (1 - 0.75)²] + [1 - (0 - 0.25)²] = 0.9375 + 0.9375 = 1.875

c2: [1 - (1 - 0.8)²] + [1 - (0 - 0.3)²] = 0.96 + 0.91 = 1.87

As one can verify, c1 scores better than c2 in a world where q is true. Now, consider a world where q is false (call this world “w2”):

c1: [1 - (0 - 0.75)²] + [1 - (1 - 0.25)²] = 0.4375 + 0.4375 = 0.875

c2: [1 - (0 - 0.8)²] + [1 - (1 - 0.3)²] = 0.36 + 0.51 = 0.87

Again, as one can verify, c1 scores better than c2 in a world where q is false.

Once one has a final epistemic value function, such as the Brier score, one must pick a decision rule. Joyce (1998) uses the decision rule that dominated options are impermissible. In the example immediately above, c2 is dominated by c1, because c1 scores better than c2 in every possible world. Thus, c2 is an impermissible credence function to have.
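For readers who want to check the numbers, here is a minimal sketch of the accuracy calculation and the dominance test just described; the Brier-style score is written so that higher is better, which matches the figures reported above.

# Compute the accuracy scores of c1 and c2 at each world and test for dominance.

worlds = {"w1": {"q": 1, "~q": 0}, "w2": {"q": 0, "~q": 1}}   # truth-values vw
c1 = {"q": 0.75, "~q": 0.25}
c2 = {"q": 0.8, "~q": 0.3}

def brier(credence, world):
    # Sum over propositions of 1 minus the squared distance from the truth-value.
    return sum(1 - (world[p] - credence[p]) ** 2 for p in credence)

for w in worlds:
    print(w, brier(c1, worlds[w]), brier(c2, worlds[w]))
# w1: 1.875 versus approximately 1.87; w2: 0.875 versus approximately 0.87
# (c1 does better at every world).

dominated = all(brier(c1, worlds[w]) >= brier(c2, worlds[w]) for w in worlds) and \
            any(brier(c1, worlds[w]) > brier(c2, worlds[w]) for w in worlds)
print(dominated)  # True: c2 is accuracy-dominated by c1, so c2 is ruled impermissible.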

Our example considers only two very simple credence functions. The final step in Joyce's program is to prove a mathematical theorem that generalizes the specific case we saw above. Joyce (1998) proves that for certain choices of accuracy measures, including the Brier score, every incoherent credence function is dominated by some coherent credence function, where a credence function is coherent iff it is a probability function. (Note that in our example, c2 is incoherent while c1 is coherent, thus illustrating an instance of this theorem.) Recall that probabilism is the thesis that rational credence functions are coherent. If we take permissible credence functions to be rational credence functions and if we can prove that no probabilistically coherent function is dominated by some probabilistically incoherent function—something that Joyce (1998) does not prove, but that is proven in Joyce (2009)—then we have a proof of probabilism from some assumptions about final epistemic value and about an appropriate decision rule.

Others have altered or extended this approach in various ways. One alteration of Joyce’s program is to use a different decision rule, for instance, the decision rule according to which permissible options maximize expected final epistemic value. Leitgeb and Pettigrew (2010a,b) use this decision rule to prove that no incoherent credence function maximizes expected utility.

The results can be extended to other norms, too. For instance, conditionalization is a rule about how to update one’s credence function in light of acquiring new information. Suppose that c is an agent’s credence function and ce is the agent’s credence function after learning e and nothing else. Conditionalization maintains that the following should hold:

For all a, and all e, c(a|e) = ce(a), so long as c(e) > 0.

In this expression, c(a|e) is the conditional probability of a, given e. Greaves and Wallace (2006) prove that, with suitable choices for accuracy measures, the updating rule conditionalization maximizes expected utility in situations where the agent will get some new information from a partition (a simple case of this is where an agent will either learn p or learn ~p). Leitgeb and Pettigrew (2010a,b) give an alternative proof that conditionalization maximizes expected utility.
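Here is a minimal sketch of updating by conditionalization over a finite set of worlds; the prior and the propositions are invented for illustration, and propositions are modeled as sets of worlds.

# Conditionalization: after learning e (and nothing else), the new credence in a
# proposition a is the old conditional credence c(a|e). Numbers are invented.

prior = {"w1": 0.2, "w2": 0.3, "w3": 0.4, "w4": 0.1}   # credence over worlds
e = {"w1", "w2"}          # the evidence: the true world is in e
a = {"w1", "w3"}          # some proposition of interest

def cr(credence, prop):
    return sum(credence[w] for w in prop)

def conditionalize(credence, evidence):
    # ce(w) = c(w) / c(e) for worlds in e, and 0 otherwise (defined when c(e) > 0).
    p_e = cr(credence, evidence)
    return {w: (credence[w] / p_e if w in evidence else 0.0) for w in credence}

posterior = conditionalize(prior, e)
print(cr(posterior, a))                 # ce(a) = 0.2 / 0.5 = 0.4
print(cr(prior, a & e) / cr(prior, e))  # c(a|e), the same value: 0.4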

Joyce is concerned with proving norms for degrees of belief. The approach can be extended to prove norms where all-or-nothing belief states are taken as primitive. Easwaran and Fitelson (2015) extend the approach in this way. Interestingly, their approach yields the result that some logically inconsistent belief states are permissible (for instance, in lottery cases). The approach has also been extended to comparative confidence rankings (where a comparative confidence ranking represents only certain qualitative facts about how confident an agent is in propositions—for instance, that she is more confident in p than in q). Williams (2012) has extended the approach in a different direction by examining cases where the background logic is nonclassical.

Joyce’s (1998) approach fits nicely into the consequentialist recipe (and subsequent work can be made to fit into the recipe, too):

Step 1. Final Value: Credences have final epistemic value in proportion to how accurate they are.

Step 2. Ranking: Credence functions are put into two classes: dominated credence functions and non-dominated credence functions.

Step 3. Normative Facts: A credence function is permissible to hold if and only if it is non-dominated.

In this way, the accuracy-first approach appears to be an especially “pure” version of epistemic consequentialism. The project is to work out what the epistemic norms are for doxastic states given that you care only about the accuracy of those doxastic states.

However, one prominent objection to the accuracy-first approach questions this. To see why, note that the verdicts about which credence functions dominate (or maximize expected epistemic value) are not sensitive to the total causal consequences of adopting a credence function, as they look only at the (expected) epistemic value of that state and not at the causal effects of the adoption of that state. There are really two points here. The first point is the same point that was noted with respect to cognitive decision theory: the accuracy-first program seems to be an instance of restricted consequentialism. This can make the view seem not to be a genuinely consequentialist view. Greaves (2013) raises some objections to the program along these lines; the issue she raises is very similar to the kinds of issues that Berker (2013a,b) and Littlejohn (2012) have raised in objections to epistemic consequentialism in traditional epistemology. The general worry is discussed below in section 5a.

The second point concerns a distinction that can be drawn between evaluating a doxastic state and evaluating the adoption of a doxastic state. The accuracy-first program seems to be interested in the former rather than the latter, which can make it seem further still from traditional consequentialism. This issue can be brought out by an example due to Michael Caie (2013). Suppose we are considering what the permissible credence function is with respect to only the propositions q and ~q where q is a self-referential proposition that says “q is assigned less than 0.5 credence.” This is an odd proposition in that if q is assigned less than 0.5 credence, then it is true (and so it would be more accurate to increase one’s credence in q), but if one increases one’s credence in q to 0.5 or greater, then q is false (and so it would be more accurate to decrease one’s credence in q). In such a situation, an incoherent credence function appears to dominate the coherent ones. To see this, note that there are no worlds where c(q) = 1, c(~q) = 0, and where q is true (because if c(q) =1, then q is false) or where c(q) = 0, c(~q) = 1, and where q is false (because if c(q) = 0, then q is true). The best that a coherent credence function can do is to assign c(q) = c(~q) = 0.5. In that case, q is false, and so the Brier score is 1.5. But compare this with the credence function, c*, according to which c*(q) = 0.5 and c*(~q) = 1. In that case, q is again false, and so c*(~q) gets a better score than does c(~q). Overall, c* gets a Brier score of 1.75.
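The figures in Caie's example can be checked with a short calculation, again using the higher-is-better Brier-style score from above; the only twist is that the truth-value of q is fixed by the credence assigned to it.

# Verify the scores in Caie's example. Because q says of itself that it receives
# credence less than 0.5, the world is determined by the credence function.

def brier(credence, truth):
    return sum(1 - (truth[p] - credence[p]) ** 2 for p in credence)

def world_for(credence):
    q_true = 1 if credence["q"] < 0.5 else 0
    return {"q": q_true, "~q": 1 - q_true}

coherent = {"q": 0.5, "~q": 0.5}
c_star = {"q": 0.5, "~q": 1.0}     # incoherent: the credences sum to 1.5

print(brier(coherent, world_for(coherent)))  # 1.5
print(brier(c_star, world_for(c_star)))      # 1.75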

How can this be, if we have proofs that incoherent credence functions are dominated by coherent ones? The answer is that the proofs by Joyce and others assume a very strong kind of independence between belief states and possible worlds. Even though there is no world where c(q) = 1, c(~q) = 0, and where q is true, Joyce and others still consider such worlds when working out which credence functions dominate or maximize expected epistemic value. With these possible worlds back in play, the incoherent c* is dominated. In particular, for the desired results (that probabilism is true, that conditionalization is the correct updating rule, and so forth) to go through, we must be able to assess how accurate a doxastic state is in a world where that doxastic state could not be held. Further, we must maintain that facts about the accuracy of doxastic states in worlds where they cannot be held are sometimes relevant to our evaluation of a doxastic state in some other world where it is actually held. This might lead one to question whether this accuracy-first approach really is a form of epistemic consequentialism (though that is of course complicated by the fact that there is no consensus about what it takes to be a consequentialist theory) and indeed whether the evaluative framework can be motivated.

d. Traditional Epistemology: Justification

i. Coherentism

According to coherentism about justification, a belief is justified if and only if it belongs to a coherent system of beliefs (note that the term “coherent” here refers to some informal notion of coherence, perhaps related to, but distinct from, the notion of coherent credences). This on its own does not commit coherentists to any sort of epistemic consequentialism. However, some of the debates and claims made within the coherentist literature suggest that some prominent coherentists are committed to some form of epistemic consequentialism. For instance, in The Structure of Empirical Knowledge, BonJour (1985) defends a version of coherentism about justification. In this work, BonJour devotes an entire chapter to giving an argument for the following thesis:

A system of beliefs which (a) remains coherent (and stable) over the long run and (b) continues to satisfy the Observation Requirement is likely, to a degree which is proportional to the degree of coherence (and stability) and the longness of the run, to correspond closely to independent reality. (p. 171)

BonJour is thus attempting to show that the degree of coherence of a set of beliefs is proportional to the likelihood that those beliefs are true. He calls this a metajustification for his coherence theory of justification. And why is such a metajustification required? He writes:

The basic role of justification is that of a means to truth, a more directly attainable mediating link between our subjective starting point and our objective goal. […] If epistemic justification were not conducive to truth in this way, if finding epistemically justified beliefs did not substantially increase the likelihood of finding true ones, then epistemic justification would be irrelevant to our main cognitive goal and of dubious worth. […] Epistemic justification is therefore in the final analysis only an instrumental value, not an intrinsic one. (pp. 7–8)

This strongly suggests that BonJour thinks of the epistemic right—justification—in consequentialist terms (Berker (2013a) claims that BonJour (1985) should be understood in this way). If justification understood as coherence is not conducive to truth, then justification understood as coherence is not valuable. This suggests the following picture:

Step 1. Final Value: True beliefs have final epistemic value; false beliefs have final epistemic disvalue.

Step 2. Ranking: Sets of beliefs are ranked in terms of their degree of coherence where this degree of coherence is proportional to the likelihood that the set of beliefs is true.

Step 3. Normative Facts: A belief is justified iff it belongs to a set of beliefs that is coherent above some threshold.

The claim in Step 2, that coherence is truth-conducive, has been addressed explicitly in the literature, starting with Klein and Warfield (1994). They argue that the fact that one set of propositions is more coherent than another set does not entail that the conjunction of the propositions in the first set is more likely to be true than the conjunction of propositions in the second set. The basic argument is that a set of propositions (say, the set including a and b) can sometimes be made more coherent by adding an additional proposition to it (to yield the set including a, b, and c). However, the conjunction (a and b and c) is never more probable than the conjunction (a and b). Bovens and Hartmann (2003) and Olsson (2005) add to this literature and each prove results to the effect that no matter one’s measure of coherence, there will be cases where one set is more coherent than another, but its propositions are less likely. (For one response to these arguments, see Huemer (2011); Angere (2007) considers whether these arguments undermine BonJour’s coherentism.)
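The probabilistic half of Klein and Warfield's point, namely that adding a proposition can never raise the probability of the conjunction, can be illustrated with a toy joint distribution; whether the enlarged set is also more coherent depends on one's measure of coherence. The distribution below is invented for illustration.

# P(a and b and c) can never exceed P(a and b), whatever the joint distribution.

# A joint probability over truth-value assignments (a, b, c).
joint = {
    (1, 1, 1): 0.30, (1, 1, 0): 0.20,
    (1, 0, 1): 0.05, (1, 0, 0): 0.05,
    (0, 1, 1): 0.05, (0, 1, 0): 0.05,
    (0, 0, 1): 0.10, (0, 0, 0): 0.20,
}

def prob(condition):
    return sum(p for world, p in joint.items() if condition(world))

p_ab = prob(lambda w: w[0] == 1 and w[1] == 1)                  # P(a and b) = 0.5
p_abc = prob(lambda w: w[0] == 1 and w[1] == 1 and w[2] == 1)   # P(a and b and c) = 0.3

print(p_ab, p_abc, p_abc <= p_ab)  # 0.5 0.3 True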

In light of difficulties establishing that coherence is truth-conducive, it is open to coherence theorists to not go down the consequentialist route. Such a coherentist might maintain that beliefs that are members of coherent sets are epistemically right independent of whether such sets are likely to be true. This mimics the non-consequentialist Kantian who maintains that certain actions are right independent of the final value that taking these actions leads to.

ii. Reliabilism

Reliabilism about justification, as championed by Alvin Goldman (1979), maintains that beliefs are justified when they are produced by suitably reliable processes. Put another way, beliefs are justified when produced by the right kinds of processes, and the right kinds of processes are those that are truth-conducive. One helpful way to think about the consequentialist structure of reliabilism is to think of it as analogous to rule utilitarianism. According to the rule utilitarian, we evaluate moral rules for rightness directly in terms of the consequences of their widespread acceptance. Actions are then evaluated in terms of whether or not they conform to a right rule. Similarly, according to reliabilism, the things up for direct consequentialist evaluation are not acts of acceptance or particular beliefs that could be adopted. Rather, processes of belief formation are evaluated consequentially. Reliabilists tend to see true belief as the sole thing of final epistemic value. Processes are thus evaluated based on their truth-ratios, the ratio of true beliefs produced to total beliefs produced. However, unlike a maximizing theory, reliabilism maintains that a process is acceptable just in case it has a truth-ratio above some absolute threshold. It is thus different from maximizing theories in two ways. First, a process can be acceptable even if it is not the most reliable process and thus not the optimally truth-conducive process. Second, a process need not be acceptable even if it is the most reliable process, because the reliabilist requires that processes meet some minimum threshold to be acceptable.

We can put a simple version of reliabilism about justification into our consequentialist recipe:

Step 1. Final Value: True beliefs have final epistemic value; false beliefs have final epistemic disvalue.

Step 2. Ranking: Processes are put into two classes: acceptable and not acceptable. If the process has a reliability score at or above the threshold, the process is acceptable; otherwise, it is not acceptable. The reliability score of a process p at world w is given by the sum of the true beliefs that process p produces at w divided by the sum of the total beliefs that process p produces at w (that is, the truth-ratio of p at w).

Step 3. Normative Facts: A belief is justified for S at t at w iff S’s belief at t at w is produced by an appropriate belief-forming process at w.
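
To make the ranking in Step 2 concrete, here is a minimal sketch of the threshold test. The threshold value and the belief records are hypothetical illustrations, not anything drawn from the reliabilist literature.

```python
# A minimal sketch of the Step 2 ranking for simple process reliabilism.
# The threshold (0.8) and the belief records are hypothetical.

THRESHOLD = 0.8

def truth_ratio(outputs):
    """Truth-ratio of a process: true beliefs produced / total beliefs produced."""
    return sum(outputs) / len(outputs)

def acceptable(outputs, threshold=THRESHOLD):
    """Step 2: a process is acceptable iff its truth-ratio meets the threshold."""
    return truth_ratio(outputs) >= threshold

# Each list records, for one process at a world, which of its outputs were true.
vision_outputs  = [True] * 95 + [False] * 5    # truth-ratio 0.95
wishful_outputs = [True] * 30 + [False] * 70   # truth-ratio 0.30

print(acceptable(vision_outputs))   # True  -> beliefs it produces are justified
print(acceptable(wishful_outputs))  # False -> beliefs it produces are not justified
```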

There are subtle ways in which reliabilism can differ from what the recipe above suggests. One of the most notable differences concerns Goldman’s (1986) approach. Although Goldman (1979) gives a theory that looks very much like what is represented above, in Goldman (1986) it is not individual processes that are ranked at Step 2, but rather systems of rules about which processes may and may not be used. A system of rules is then acceptable if and only if a believer who follows those rules has an overall truth-ratio above a certain threshold. Thus, the analogy to rule utilitarianism is even stronger in Goldman (1986) than in Goldman (1979), something which he explicitly notes. There has also been some dispute among reliabilists about the exact way that processes should be scored for their reliability (and so the exact form of Step 2), but despite that, the view looks to be committed to some form of consequentialism.

iii. Evidentialism

One of the main rivals of reliabilism about justification is evidentialism, initially defended by Richard Feldman and Earl Conee (1985) (whether evidentialism is a rival of coherentism depends subtly on exactly how the views are spelled out). Evidentialism maintains that the belief that p is justified for an agent at time t iff p is supported by the agent’s total evidence at t. Conee (1992) motivates the total evidence requirement with reference to an overriding goal of true belief, in which case evidentialists agree with reliabilists and with BonJour-style coherentists that justification is a matter of truth conduciveness. Feldman (2000) motivates the total evidence requirement with reference to an overriding goal of reasonable belief (rather than true belief), in which case evidentialists disagree with reliabilists and BonJour-style consequentialists about the nature of final epistemic value, but agree that justification should be spelled out in consequentialist terms. More recently, Conee and Feldman (2008) have suggested that what has final epistemic value is coherence. Whether this view is committed to consequentialism depends on how the details are spelled out. If the idea is that a doxastic state is justified in proportion to how much it promotes the value of coherence, whether in itself or in its causal consequences, then such a view is plausibly committed to consequentialism, with the good of coherence substituted for the good of true belief. However, there may be other ways of interpreting their view according to which it looks less committed to consequentialism.

It should be noted that Feldman (1998) makes clear that the only thing relevant to whether one should believe p is one’s evidence now concerning p’s truth. The causal consequences of believing p are explicitly ruled out by Feldman as relevant to that belief’s justificatory status. So if Feldman is to count as a consequentialist, it is of a very restricted sort. Presumably, Feldman holds something similar in Conee and Feldman (2008). Conee (1992), on the other hand, has expressed more sympathy with the idea that we should sometimes sacrifice epistemic value now for more epistemic value later. Thus, there is perhaps a stronger case that Conee’s version of evidentialism is also some form of consequentialism.

e. Traditional Epistemology Not Concerned with Justification

Stephen Stich (1990) offers a method of epistemic evaluation not concerned with justification, but that is committed to consequentialism. According to Stich, there are no special epistemic values (such as true belief), there are just things that people happen to value. Reasoning processes and reasoning strategies are seen as one tool that we use to get what we value. Stich (1993, p. 24) writes: “One system of cognitive mechanism is preferable to another if, in using it, we are more likely to achieve those things that we intrinsically value.” Thus, we have cognitive mechanisms being ranked in terms of their consequences, but where the consequences that matter are not uniquely epistemic, but rather anything that we happen to intrinsically value.

Richard Foley’s (1987) The Theory of Epistemic Rationality is not directed at analyzing justification. Nevertheless, it provides another example of work in traditional epistemology that seems to be committed to some form of epistemic consequentialism. Foley identifies our epistemic goal as that of now believing those propositions that are true and not now believing those propositions that are false. It is then epistemically rational for a person to believe a proposition whenever on careful reflection that person has reason to believe that believing that proposition will promote his or her epistemic goals, provided that all else is equal. Foley is clear, however, that he does not intend his view to sanction as rational adopting a belief that one is now confident is false in order to garner more true beliefs later. Thus, like some of the other views canvassed here, Foley adopts something like a consequentialist framework for evaluating beliefs, but in a restricted way, where the causal consequences of beliefs are not relevant to the normative verdicts of those beliefs.

Though a large focus of Goldman (1986) is to give a reliabilist account of justification, he notes that there are other important ways that processes, and thus that beliefs produced by those processes, can be evaluated. In particular, Goldman considers evaluating processes for their speed and for their power. The speed of a process concerns how quickly a process issues true beliefs. The power of a process concerns how much information a process gives to you. A highly reliable process might have very little speed if it takes a very long time to issue a belief. And the same highly reliable process might have very little power if it produces only that one belief. Goldman suggests that we can use a consequentialist-style analysis to evaluate processes in these ways, too.

Bishop and Trout (2005) argue against the practice of so-called standard analytic epistemology, which includes many of the approaches to justification looked at above. Bishop and Trout propose a view according to which we evaluate reasoning strategies by drawing on empirical work in psychology, rather than by consulting our intuitions. According to Bishop and Trout, the three factors that affect the quality of a reasoning strategy are: (1) whether the strategy is reliable across a wide range of problems, (2) the ease with which the strategy is used, and (3) the significance of the problems toward which the reasoning strategy can be used. They emphasize that whether a set of reasoning strategies is an excellent one to use depends on a cost/benefit analysis. It is natural, then, to think of their normative verdicts about whether a reasoning strategy is excellent as depending on the consequences of using that strategy along dimensions (1)–(3).

In this section and in the one before, we have seen that some traditional epistemologists with otherwise diverse views about justification or epistemic evaluation more generally seem to be committed, at bottom, to a kind of epistemic consequentialism. The aforementioned theories do not merely identify some bearer of final epistemic value, but also define one designator of epistemic rightness (for example, justification, rationality, epistemic excellence) in terms of such value.

f. Social Epistemology

Social epistemology is concerned with the way that social institutions, practices, and interactions are related to our epistemic endeavors, such as knowledge generation. Several prominent approaches within social epistemology also seem to be committed to some form of epistemic consequentialism.

Alvin Goldman’s (1999) Knowledge in a Social World is a nice example of social epistemology done with explicit commitments to consequentialism. Goldman writes:

People have interest, both intrinsic and extrinsic, in acquiring knowledge (true belief) and avoiding error. It therefore makes sense to have a discipline that evaluates intellectual practices by their causal contributions to knowledge or error. This is how I conceive of epistemology: as a discipline that evaluates practices along truth-linked (veritistic) dimensions. Social epistemology evaluates specifically social practices along these dimensions. (p. 69)

Goldman’s general approach is to adopt a question-answering model. According to this approach, beliefs in propositions have value or disvalue when those propositions are answers to questions that interest the agent. This suggests that Goldman promotes a view according to which final epistemic value is accuracy with respect to questions of interest, and not mere accuracy alone. As Goldman conceives of it, the epistemic value of believing a true answer to a question of interest is 1, the epistemic value of withholding belief with respect to a true answer is 0.5, and the epistemic value of rejecting a true answer is 0. Goldman extends this to degrees of belief in the natural way: the epistemic value of having a degree of belief x in a true proposition is x. (It is worth noting that this corresponds to a scoring rule that is improper; compare section 3c.) We can then evaluate social practices instrumentally, in terms of their causal contributions to belief states that have final epistemic value. Goldman does this by first specifying the appropriate range of applications for a practice. This will involve actual and possible applications (because some practices do not have an actual track record). Second, one takes the average performance of the practice across these applications. The average performance of a practice determines how it is ranked compared to its competitors. Thus, on this view, it is something like objective expected epistemic value that ranks the various practices.
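
The parenthetical point about impropriety can be illustrated with a short calculation. The sketch below is not Goldman’s; it compares the linear veritistic rule for a yes/no question with the Brier score (a standard proper scoring rule), and the credence of 0.7 is a hypothetical example. It shows that an agent who is 0.7 confident in the true answer maximizes expected veritistic value by reporting full certainty.

```python
# A sketch of why the linear "veritistic value" rule (value x for credence x
# in a truth) is an improper scoring rule, unlike the Brier score.
# Numbers here are illustrative only, and the setup is a yes/no question.

def expected_linear_value(report, credence):
    # Linear rule: value of reporting `report` is `report` if the proposition
    # is true and (1 - report) if it is false; expectation taken with `credence`.
    return credence * report + (1 - credence) * (1 - report)

def expected_brier_value(report, credence):
    # Brier "value" = 1 - squared error; expectation taken with `credence`.
    return credence * (1 - (1 - report) ** 2) + (1 - credence) * (1 - report ** 2)

credence = 0.7
reports = [i / 100 for i in range(101)]

best_linear = max(reports, key=lambda r: expected_linear_value(r, credence))
best_brier  = max(reports, key=lambda r: expected_brier_value(r, credence))

print(best_linear)  # 1.0 -> the linear rule rewards exaggerating to certainty
print(best_brier)   # 0.7 -> the Brier score rewards reporting one's credence
```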

Consider an example. Goldman argues that civil-law systems are better, from an epistemic perspective, than are common-law systems. The argument for this is complex, but the general structure follows the framework described above. Goldman considers various differences between the two systems, including the numerous exclusionary evidentiary rules in the common-law system as compared to the civil-law system, the large role that adversarial lawyers play in the common-law system as compared to the civil-law system, and the fact that the civil-law system employs trained judges as decision-makers rather than lay jurors. With respect to each of these differences, one can approximate the epistemic value for the relevant decision-makers under each system. For instance, one can estimate how many correct verdicts compared to incorrect verdicts jurors would reach if there were exclusionary evidentiary rules compared to if there were not. On balance, Goldman argues, the civil-law system performs better. For another evaluation of legal structures in consequentialist terms, see Laudan (2006).

Goldman (1999) directs this same style of consequentialist argument toward a variety of social practices, including testimony, argumentation, Internet communication, speech regulation, scientific conventions, law, voting, and education.

Note, however, an important shift in the consequentialist view Goldman defends here compared to earlier theories considered. Previously, the things being evaluated have been belief states or acts of acceptance. Here, Goldman is evaluating social practices and methodologies. We could call the approach in Goldman (1999) an instance of methodological epistemic consequentialism, whereas the former theories are instances of doxastic epistemic consequentialism (note that this terminology is not standard and is introduced simply for clarity within this article).

The basic view can be put into our recipe as follows:

Step 1. Final Value: Accurate beliefs of S in answer to questions that interest S have final epistemic value.

Step 2. Ranking: Social practices are ranked according to the average amount of final epistemic value that they produce across the range of situations they can be applied to.

Step 3. Normative Facts: Social practice A is epistemically better than social practice B just in case A and B are alternatives to each other and A is ranked higher than B in Step 2.

For criticism of Goldman’s social epistemology that focuses specifically on its consequentialist commitments, see DePaul (2004). See also Fallis (2000, 2006).

g. Philosophy of Science

Though Goldman’s work in social epistemology touches on aspects of science, more generally his focus is on social practices. Others are interested in similar questions about social practices, structures, and conventions, but specifically with respect to science. In some of this work, there is a clear foundation of something like epistemic consequentialism.

i. Group versus Individual Rationality

Philip Kitcher (1990) is one of the first to apply formal models to social structures in science to determine the optimal structure for a group of researchers to achieve their scientific goals. The guiding idea behind his work is that if everyone were rational, then they would each make decisions about which projects to explore based on what the evidence supports and there would be a uniformity of practices among scientists. This uniformity would be bad, however, because it would prevent people from pursuing research on new up-and-coming theories (for example, continental drift in the 1920s) as well as on older outgoing theories (for example, phlogiston theory in the 1780s). Kitcher defines two notions: X’s personal epistemic intentions are what X wishes to achieve himself and X’s impersonal epistemic intentions are what X wishes his community to achieve. The question at hand can then be put: how would scientists rationally decide to coordinate their efforts if their decisions were dominated by their impersonal epistemic intentions?

Kitcher formalizes this situation by supposing that there are N researchers working on a particular research question, and each has to determine which research program she will pursue. Define a return function, Pi(n), which represents the chance that program i will be successful given that n researchers are pursuing it. Suppose that each researcher’s personal epistemic intention is to successfully answer the research question. In that case, each researcher will adopt whichever program i has the largest value of Pi(ni + 1), where ni is the number of researchers already pursuing i. However, if we suppose that each researcher’s impersonal epistemic intention is that someone in the community of researchers successfully answers the question, then this way of choosing research programs may not be the way to realize the impersonal epistemic intention. Consider a simple example where there are two research programs, 1 and 2, and N researchers. The best way to achieve the group goal is to maximize P1(n) + P2(N – n). But this could be a different distribution from the one that would result were each researcher to be guided by her personal epistemic intention. To see this, suppose that there are j researchers in program 1 and k researchers in program 2. It could be that P1(j+1) > P2(k+1), and so a new researcher would choose program 1. But for all that, it could be that P1(j+1) – P1(j) < P2(k+1) – P2(k). That is, the boost in probability of success that program 2 gets from the addition of one more researcher is greater than the boost program 1 gets. In that case, it is better for the group for the new researcher to join program 2 (the toy model sketched below illustrates this divergence). Kitcher goes on to argue that certain intuitively unscientific goals, such as the goal of fame or popularity, could help motivate researchers into a division of labor that serves the impersonal goals rather than the personal goals of each researcher.
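
Here is a toy model of the possible divergence between individual and group choice. The return functions below are hypothetical; only their shape (quickly saturating returns for program 1, slow but steady returns for program 2) matters for the point.

```python
# A toy version of Kitcher's division-of-labor point. The return functions
# are hypothetical; program 1 saturates quickly, program 2 keeps improving.

N = 10  # total number of researchers

def P1(n):
    # Chance program 1 succeeds with n researchers: high but quickly saturating.
    return min(0.9, 0.5 + 0.08 * n) if n > 0 else 0.0

def P2(n):
    # Chance program 2 succeeds with n researchers: lower but keeps improving.
    return min(0.9, 0.07 * n) if n > 0 else 0.0

# Myopic individual choice: each arriving researcher joins whichever program
# would have the higher chance of success were she to join it.
n1 = n2 = 0
for _ in range(N):
    if P1(n1 + 1) >= P2(n2 + 1):
        n1 += 1
    else:
        n2 += 1
individual_split = (n1, n2)

# Group choice: the split of N researchers maximizing the group objective in
# Kitcher's simple model, P1(k) + P2(N - k).
best_k = max(range(N + 1), key=lambda k: P1(k) + P2(N - k))
group_split = (best_k, N - best_k)

print("individual split:", individual_split)   # (10, 0)
print("group-optimal split:", group_split)     # (5, 5)
```

With these (made-up) return functions, every researcher piles onto program 1, while the group does best with an even split.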

Kitcher does not claim that there is one objective answer to what the appropriate epistemic intentions or values are. Nevertheless, there is a consequentialist structure to his argument. Groups of scientists are seen as rational when they choose among options in such a way that they maximize their chance of attaining their epistemic goals. One could question whether this is enough to make the view count as a version of epistemic consequentialism. After all, the options that the agents in Kitcher’s model are choosing between are not beliefs or belief states, but instead decisions about which research program to pursue or about which experiment to run. In this way, Kitcher’s view looks to be an instance of methodological epistemic consequentialism as opposed to doxastic epistemic consequentialism: it is aimed at evaluating actions that are in some way closely related to epistemic ends, rather than at evaluating belief states themselves. Some have argued that approaches such as these do not actually address properly epistemic questions at all. For some thoughts on this, see Christensen (2004, 2007).

Others have followed the general argumentative structure of Kitcher (1990). Zollman (2007, 2010) and Mayo-Wilson, Zollman, and Danks (2011) have focused on the communication networks that might exist between scientists working on the same project. This work reveals some surprising conclusions, in particular, that it might sometimes be epistemically beneficial for the community of scientists to have less than full communication among its members. The basic reason for this is that limiting communication is one way to encourage diversity in research programs, which, for Kitcher-like reasons, can help the community do better than it otherwise would. Muldoon and Weisberg (2009) and Muldoon (2013) have focused on the kinds of research strategies that individual scientists might have, modeling scientific research as a hill-climbing problem of the sort studied in the computer science literature. They show how it can sometimes be beneficial for the group of scientists to have individuals who are more radical in their exploration strategies.

So far we have surveyed formal models in the philosophy of science literature that seem to take a consequentialist approach to epistemic evaluation. One of the main results of this work is to show how strategies that would be irrational if followed in isolation might yield rational group behavior. Others have emphasized something like this point, but without formal models. Miriam Solomon (1992), for instance, argues for a similar conclusion by drawing on work in psychology and considering the historical data about the shift in geology to accept continental drift. She argues that certain seeming psychological foibles of individual geologists, including cognitive bias and belief preservation, played an important role in the discovery of plate tectonics. Paradoxically, she argues, these attributes that are normally seen as rational failings were in fact conducive to scientific success because they made possible the distribution of research effort. That her work employs a kind of consequentialist picture is evidenced by the fact that she views the central normative question in the philosophy of science to be: “whether or not, and where and where not, our methods are conducive to scientific success…Scientific rationality is thus viewed instrumentally.” (p. 443)

Larry Laudan is another philosopher of science who adopts a generally consequentialist outlook. For Laudan (1984), the things we are ultimately evaluating are methodological rules. As Laudan writes:

… a little reflection makes clear that methodological rules possess what force they have because they are believed to be instruments or means for achieving the aims of science. More generally, both in science and elsewhere, we adopt the procedural and evaluative rules we do because we hold them to be optimal techniques for realizing our cognitive goals or utilities. (1984, p. 26)

There is, on Laudan’s view, not one set of acceptable cognitive goals, although there are ways to rationally challenge the cognitive goals that someone holds. This can be done by either showing that the goals are unrealizable or showing that the goals do not reflect the communal practices that we endorse. On Laudan’s view, then, what has final epistemic value is the realizing of the cognitive goals that we have, so long as these goals are not ruled out in one of the ways above. We can then rank methodological rules, or groups of methodological rules, in virtue of how well they reach those cognitive goals that we have. We then evaluate those rules as rational or not in virtue of this ranking. Laudan does not say that the methodological rules must be optimal, but does suggest, as the quote above notes, that we must think that they are.

ii. Why Gather Evidence?

Another area of philosophy of science that seems committed to epistemic consequentialism concerns the initially odd-sounding question: why should a scientist gather more evidence? On its face, the answer to this question is obvious. But if we idealize scientists as perfectly rational agents, some models of rationality make the question more pressing. For instance, consider an austere version of the Bayesian account of epistemic rationality according to which one is epistemically rational if and only if one’s degrees of belief are probabilistically coherent and one updates one’s beliefs via conditionalization upon receipt of any evidence. An agent can do this perfectly well without ever gathering new evidence. In addition, notice that there is a risk associated with gathering new evidence. Although in the best-case scenario, one acquires information that moves one closer to the truth, it is of course possible that one gets misleading evidence and so is pushed further from the truth. Is there anything that can be said in defense of the intuitive verdict that despite this, it is still rational to gather evidence?

An early answer to this question is provided by I. J. Good (1967). Suppose that you are going to have to make a decision and you can perform an experiment first and then make the decision or you can simply make the decision. Good shows that if you choose by maximizing subjective expected value, if there is no cost of performing the experiment, and if several other constraints are imposed, then the subjective expected value of your choice is always at least as great after performing the experiment as before. Here then we have an argument in favor of a certain sort of epistemic behavior—gathering evidence—that is consequentialist at heart. It says that if you do this sort of thing, you can expect to make better choices. However, it is not clear that this is an epistemic consequentialist argument. At best, it suggests that experimenting is pragmatically rational. To drive this point home, note that it seems there are experiments that are epistemically rational to perform even if there is no reason to expect that any decision we will make depends on the outcome.
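
A small numerical example conveys the structure of Good’s result. Everything in the sketch (the prior, the likelihoods, the utilities) is hypothetical; the point is that the expected utility of deciding after a cost-free observation is never lower than that of deciding now.

```python
# A toy illustration of Good's (1967) point: with a cost-free experiment and
# choice by maximizing subjective expected utility, the expected utility of
# deciding after looking is at least that of deciding now.
# All numbers (prior, likelihoods, utilities) are hypothetical.

prior_h = 0.6  # prior probability of hypothesis H

# Utilities of the two available acts depending on whether H is true.
utility = {
    ("act_A", True): 10, ("act_A", False): 0,
    ("act_B", True): 2,  ("act_B", False): 8,
}

# Likelihoods of a binary test result E given H / not-H.
p_e_given_h, p_e_given_noth = 0.9, 0.2

def expected_utility(act, p_h):
    return p_h * utility[(act, True)] + (1 - p_h) * utility[(act, False)]

def best_eu(p_h):
    return max(expected_utility(a, p_h) for a in ("act_A", "act_B"))

# Decide now, without the experiment.
eu_now = best_eu(prior_h)

# Decide after observing the (free) test result, updating by Bayes' theorem.
p_e = prior_h * p_e_given_h + (1 - prior_h) * p_e_given_noth
posterior_if_e    = prior_h * p_e_given_h / p_e
posterior_if_note = prior_h * (1 - p_e_given_h) / (1 - p_e)
eu_after = p_e * best_eu(posterior_if_e) + (1 - p_e) * best_eu(posterior_if_note)

print(round(eu_now, 3), round(eu_after, 3))  # eu_after >= eu_now
```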

Others, however, have attempted to extend the basic Good result to scenarios where only final epistemic value is at issue. Oddie (1997), for instance, shows that if one uses a proper scoring rule to measure accuracy and if one updates via conditionalization, then the expected final epistemic value of learning information from a partition is always at least as great as that of refusing to learn the information. Myrvold (2012) generalizes this basic result, showing that one does not even need to require updating via conditionalization: so long as one satisfies Bas van Fraassen’s (1984) reflection principle, a result similar to Oddie’s holds. For commentary on van Fraassen’s reflection principle, see Maher (1992). For other work on the issue of gathering evidence, see Maher (1990) and Fallis (2007).
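
The flavor of the Oddie-style results can be conveyed with a simplified calculation for a single proposition, using the Brier score as the (proper) measure of inaccuracy. The prior and likelihoods below are hypothetical, and the sketch evaluates only one’s credence in H rather than a whole credence function, so it is an illustration rather than a proof.

```python
# A minimal numerical sketch: for a proposition H, the Brier score as the
# proper measure of inaccuracy, and conditionalization on the partition
# {E, not-E}, learning never increases expected inaccuracy.
# The prior and likelihoods are hypothetical.

prior_h = 0.5
p_e_given_h, p_e_given_noth = 0.8, 0.3

p_e = prior_h * p_e_given_h + (1 - prior_h) * p_e_given_noth
post_h_given_e    = prior_h * p_e_given_h / p_e
post_h_given_note = prior_h * (1 - p_e_given_h) / (1 - p_e)

def brier_inaccuracy(credence, truth):
    return (credence - (1.0 if truth else 0.0)) ** 2

def expected_inaccuracy(cred_if_e, cred_if_note):
    """Expected Brier inaccuracy of one's credence in H, computed from the
    prior, when one adopts cred_if_e upon learning E and cred_if_note upon
    learning not-E."""
    return (p_e * (post_h_given_e * brier_inaccuracy(cred_if_e, True)
                   + (1 - post_h_given_e) * brier_inaccuracy(cred_if_e, False))
            + (1 - p_e) * (post_h_given_note * brier_inaccuracy(cred_if_note, True)
                           + (1 - post_h_given_note) * brier_inaccuracy(cred_if_note, False)))

refuse = expected_inaccuracy(prior_h, prior_h)                    # ignore the evidence
learn  = expected_inaccuracy(post_h_given_e, post_h_given_note)   # conditionalize

print(round(refuse, 4), round(learn, 4))  # learning never does worse in expectation
assert learn <= refuse
```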

Work in this area seems clearly committed to an especially veritistic form of epistemic consequentialism. Here we have an argument in favor of acquiring new evidence (if it is available) that appeals solely to the increase in accuracy one can expect to get from such evidence. As Oddie (1997, p. 537) writes: “The idea that a cognitive state has a value which is completely independent of where the truth lies is just bizarre. Truth is the aim of inquiry.”

4. Summing Up: Some Useful Distinctions

Now that we have surveyed a variety of theories that seem to have some commitment to epistemic consequentialism, it is useful to remind ourselves of two important distinctions relevant to categorizing different species of epistemic consequentialism.

First, some of the theories discussed above are committed to restricted consequentialism. According to these views, the normative facts about Xs are determined by some restricted set of the consequences of the Xs. More precisely, consider a theory that will issue normative verdicts about some belief b. A restricted consequentialist view maintains that something has final epistemic value, but that the normative facts about b are not determined by the amount of final epistemic value contained in the entire set of b’s causal consequences. In the limit, none of the causal consequences of b are relevant; only the final epistemic value contained in b itself is relevant. For instance, Feldman’s view about justification, Foley’s view about rationality, the approach of cognitive decision theory, and some versions of the accuracy-first program appear to be restricted consequentialist views in this limiting sense. Feldman, recall, explicitly states that the causal consequences of adopting a belief are irrelevant to its justificatory status; Foley focuses on the goal of now believing the truth and not now believing falsely, so excludes causal consequences; and Joyce’s accuracy-first program looks at whether some doxastic state dominates another doxastic state when the states are looked at for their accuracy now. Reliabilism is arguably also a form of restricted consequentialism, because the causal consequences of the belief itself are not relevant to its normative status; rather, it is the status of the particular process of belief formation that led to the belief that is relevant to the belief’s normative status. A process of belief formation earns its status, in turn, in terms of the proportion of true beliefs that it directly produces, so not even the total consequences of a belief-forming process are relevant according to the reliabilist.

Unrestricted consequentialist views, on the other hand, are those according to which the normative facts about whatever is being evaluated are determined by the amount of final epistemic value in the entire set of that thing’s causal consequences. It is unclear whether we have seen any wholly unrestricted consequentialist views in this sense, although Goldman’s approach to social epistemology and Kitcher’s approach to the distribution of cognitive labor may come close.

It is something of an open question whether a restricted consequentialism is genuinely a form of consequentialism. Some discussions of consequentialism in ethics suggest that restricted versions of consequentialism are not genuinely instances of consequentialism (see, for instance, Pettit (1988), Portmore (2007), Smith (2009), and Brown (2011)). Klausen (2009) argues that restricted versions of consequentialism are not genuinely instances of consequentialism, specifically with respect to epistemology.

The second important distinction to keep in mind when categorizing species of epistemic consequentialism is a distinction between those theories that seek to evaluate belief states and those that seek to evaluate some sort of action of some epistemic relevance. An example will make this distinction clearer. The accuracy-first program seeks to evaluate belief states based solely on their accuracy. Kitcher’s approach to the distribution of cognitive labor seeks to evaluate the decisions of scientists to engage in certain lines of research based on the ultimate payoff in terms of true belief for the scientific community. As noted above, we could call the first approach an instance of doxastic epistemic consequentialism and the second sort of approach an instance of methodological epistemic consequentialism (again, note that these terms are not established in the literature). With this distinction in hand, we can sort some of the theories above along this dimension. Attempts to explain why it is rational to gather evidence, much of social epistemology, and the work on communication structures and exploration strategies among scientists are instances of methodological epistemic consequentialism. Consequentialist analyses of justification, cognitive decision theory, and the accuracy-first program are instances of doxastic epistemic consequentialism.

5. Objections to Epistemic Consequentialism

Theories committed to some form of epistemic consequentialism will have specific objections that can be lodged against them. Here we will focus on general objections to the fundamental idea behind epistemic consequentialism.

a. Epistemic Trade-Offs

Epistemic consequentialists maintain that, in some way, the right option is one that is conducive to whatever has final epistemic value. Say that you accept a trade-off if you sacrifice something of value for even more of what is valuable. Thus, if true belief has final epistemic value (and if each true belief has equal final epistemic value), you accept a trade-off when you sacrifice a true belief concerning p for two true beliefs about q and r. It is hard to see how one can hold a consequentialist view and deny that it is at least sometimes permissible to accept trade-offs, for if trade-offs were never permissible, rightness would no longer be understood in terms of conduciveness to what has value (though, as we will see, restricted consequentialists of a certain sort may be able to deny this).

The permissibility of accepting trade-offs, however, constitutes a problem for epistemic consequentialism. If one thinks about consequentialist theories in ethics, this is not so surprising. Some of the strongest intuitive objections to consequentialist moral theories are those that focus on trade-offs. Consider, for instance, the organ harvest counterexample to utilitarianism (Thomson 1985). In that scenario, a doctor has five patients all in dire need of a different organ transplant. The doctor also has a healthy patient who is a potential donor for each of the five patients. Because it is a consequentialist moral theory and endorses trade-offs, it seems that utilitarianism says the doctor is required to sacrifice the one to save the five. But, it is alleged, this flies in the face of common sense, and so we have a challenge for utilitarianism.

Trade-off objections to epistemic consequentialism (structurally similar to the organ harvest) have been made explicitly by Firth (1981), Jenkins (2007), Littlejohn (2012), Berker (2013a, 2013b), and Greaves (2013). And one can see hints of such an objection in Easwaran and Fitelson (2012) and Caie (2013).

The basic objection starts with the observation that a belief can be justified or rational or epistemically appropriate (or whatever other term for epistemic rightness one prefers) even if adopting that belief causes some epistemic catastrophe. Similarly, it seems that a belief can be unjustified or irrational or epistemically inappropriate even if adopting that belief results causally in some epistemic reward. For an example of the first sort, S might have significant evidence that he is an excellent judge of character, and so believing this about himself might be justified for S. But it could be that this belief serves to make S overconfident in other areas of his life, so that S ends up misreading evidence quite badly in the long run. For an example of the second sort, S might have no evidence that God exists, but believe it anyway to make it more likely that S receives a large grant from a religiously affiliated (and unscrupulous) funding agency. The grant will allow S to believe many more true and interesting propositions than otherwise (the example is due to Fumerton (1995, p. 12)). These kinds of examples seem to show that epistemic rightness cannot be understood in terms of conduciveness to what has final epistemic value.

There are two main responses that the epistemic consequentialist can make to the trade-off objection, and each comes with a challenge. The first response is to maintain that, appearances to the contrary, there are versions of epistemic consequentialism that do not sanction unintuitive trade-offs. For a response in this vein, see Ahlstrom-Vij and Dunn (2014). In ethics, some who think of themselves as consequentialists respond to analogous objections by introducing agent-relative values (see, for instance, Sen (1982) and Broome (1995)). The basic idea is that we can have agent-relative values in the outcomes of states, which allows, for example, for agent S to value the state where S breaks no promises more than someone else values that same state. This allows for one to give a consequentialist-based evaluation of rightness that does not always require one to say that it is right for S to break a promise in order to ensure that two others do not break their promises. It is not clear how such a modification of consequentialism would best carry over to epistemic consequentialism, but it could represent a way of making this first response. The challenge for any response in this vein is to explain how such views are genuinely an instance of epistemic consequentialism.

The second response to trade-off objections is to maintain that while epistemic consequentialism does sanction trade-offs, we can explain away the felt unintuitiveness of such verdicts. The challenge for this second response is to actually give such an explanation.

b. Positive Epistemic Duties

When it comes to moral obligation, it seems plausible that we sometimes have obligations to take certain actions and sometimes have obligations to refrain from certain actions. It is then natural to distinguish between positive duties—say, the obligation to take care of my children—and negative duties—say, the obligation to not steal from others. Consider how a similar distinction would be drawn in epistemology. Obligations to believe certain propositions would correspond to positive epistemic duties, while obligations to refrain from believing certain propositions would correspond to negative epistemic duties.

Littlejohn (2012) has argued that certain forms of epistemic consequentialism look as though they will naturally lead to positive epistemic duties. Suppose, as certain doxastic epistemic consequentialists will maintain, that whether we are obligated to believe or refrain from believing a proposition is a function of the final epistemic value of believing or refraining from believing that proposition. And suppose that the consequentialist also maintains that we have some negative epistemic duties; that is, there are situations where one is epistemically obligated to refrain from believing a proposition. The consequences of refraining in such a situation will have some level of epistemic value, and on the consequentialist picture it is that value which grounds the obligation to refrain. But we can surely find a situation in which believing a proposition has consequences with at least as much epistemic value, and if that much value grounds an obligation to refrain, it should equally ground an obligation to believe. Thus, it looks as though the consequentialist is committed to saying that there are positive epistemic duties: sometimes we are obligated to believe propositions.

However, some epistemologists hold that we have no positive epistemic duties. We may be obligated to refrain from believing certain things, but we have no duties to believe. Nelson (2010) provides one argument for this claim. He argues that if we had positive epistemic duties, we would have to believe each proposition that our evidence supported. But this means we would be epistemically obligated to believe infinitely many propositions, as Nelson argues that any bit of evidence supports infinitely many propositions. As we cannot believe infinitely many propositions, Nelson holds that we have no positive epistemic duties.

The thesis that there are no positive epistemic duties is controversial, as is Nelson’s argument for that claim. Nevertheless, this presents a potential worry for certain versions of epistemic consequentialism. It is perhaps worth noting that this sort of objection to epistemic consequentialism is in some ways analogous to objections that maintain that consequentialist views in ethics are overly demanding. For more on the issue of positive epistemic duties, see Stapleford (2013) and the discussion in Littlejohn (2012, ch. 2).

c. Lottery Beliefs

Suppose that you know there is a lottery with 10,000 tickets, each with an equal chance of winning, but where only one ticket will win. Consider the proposition that ticket 1437 will lose. It is incredibly likely that this proposition is true, and the same goes for each of the 10,000 propositions of the form ticket n will lose. Nevertheless, a number of epistemologists maintain that one is not justified in believing such lottery propositions (for instance, BonJour (1980), Pollock (1995), Evnine (1999), Nelkin (2000), Adler (2005), Douven (2006), Kvanvig (2009), Nagel (2011), Littlejohn (2012), Smithies (2012), McKinnon (2013), and Locke (2014)).

Some consequentialist approaches to justification, however, look as though they will say that one is justified in believing such lottery propositions. For instance, suppose that there is a process of belief formation that issues beliefs of the form ticket n is a loser. This process is highly reliable, and so beliefs produced by it are justified according to one version of reliabilism about justification. Some process reliabilists about justification might maintain that there is no such process in an attempt to avoid this implication of their view. However, as Selim Berker (2013b) has noted, the very structure of consequentialist views in epistemology suggests that some case can be brought against the consequentialist in which a set of beliefs counts as justified purely in virtue of statistical information about the relative lack of falsehoods among the propositions believed.
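
A quick calculation shows why threshold versions of reliabilism appear to deliver this verdict. The sketch below assumes a hypothetical threshold of 0.99; the lesson is that the ticket-will-lose process clears any plausible threshold.

```python
# The lottery worry for threshold reliabilism: a process that, for each
# ticket, produces the belief "ticket n will lose" has an extremely high
# truth-ratio. The threshold (0.99) is hypothetical.

TICKETS = 10_000
WINNER = 77            # whichever ticket wins, exactly one such belief is false
THRESHOLD = 0.99

# The process outputs "ticket n will lose" for every n; exactly one is false.
beliefs = [n != WINNER for n in range(1, TICKETS + 1)]
truth_ratio = sum(beliefs) / len(beliefs)

print(truth_ratio)               # 0.9999
print(truth_ratio >= THRESHOLD)  # True: each lottery belief counts as justified
```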

Again, not everyone denies that there is justification to be had in such cases; some maintain that while such lottery propositions cannot be known, they can nevertheless be justified. But a significant number of epistemologists do deny that lottery beliefs are justified, and so we again have a potential worry here for the consequentialist. For a response to this worry, see Ahlstrom-Vij and Dunn (2014).

6. References and Further Reading

  • Adler, J. (2005) ‘Reliabilist Justification (or Knowledge) as a Good Truth-Ratio’ Pacific Philosophical Quarterly 86: 445–458.
  • Ahlstrom-Vij, K. and Dunn, J. (2014) ‘A Defence of Epistemic Consequentialism’ Philosophical Quarterly 64: 541–551.
  • Angere, S. (2007) ‘The Defeasible Nature of Coherentist Justification’ Synthese 157: 321–335.
  • Berker, S. (2013a) ‘Epistemic Teleology and the Separateness of Propositions’ The Philosophical Review 122: 337–393.
  • Berker, S. (2013b) ‘The Rejection of Epistemic Consequentialism’ Philosophical Issues 23: 363–387.
  • Bishop, M. and Trout, J. D. (2005) Epistemology and the Psychology of Human Judgment. Oxford: Oxford University Press.
  • BonJour, L. (1980) ‘Externalist Theories of Empirical Knowledge’ Midwest Studies in Philosophy 5: 53–74.
  • BonJour, L. (1985) The Structure of Empirical Knowledge. Cambridge, MA: Harvard University Press.
  • Bovens, L., and Hartmann, S. (2003) Bayesian Epistemology. Oxford: Oxford University Press.
  • Broome, J. (1991) Weighing Goods: Equality, Uncertainty and Time. Oxford: Wiley-Blackwell.
  • Brown, C. (2011) ‘Consequentialize This’ Ethics 121: 749–771.
  • Caie, M. (2013) ‘Rational Probabilistic Incoherence’ Philosophical Review 122: 527–575.
  • Christensen, D. (2004) Putting Logic in Its Place. Oxford: Oxford University Press.
  • Christensen, D. (2007) ‘Epistemology of Disagreement: The Good News’ Philosophical Review 116: 187–217.
  • Conee, E. (1992) ‘The Truth Connection’ Philosophy and Phenomenological Research 52: 657–669.
  • Conee, E. and Feldman, R. (2008) ‘Evidence’ In Q. Smith (Ed.), Epistemology: New Essays. Oxford: Oxford University Press: 83–104.
  • DePaul, M. (2004) ‘Truth Consequentialism, Withholding and Proportioning Belief to the Evidence’ Philosophical Issues 14: 91–112.
  • Douglas, H. (2000) ‘Inductive Risk and Values in Science’ Philosophy of Science 67: 559–579.
  • Douglas, H. (2009) Science, Policy, and the Value-Free Ideal. Pittsburgh, PA: University of Pittsburgh Press.
  • Douven, I. (2006) ‘Assertion, Knowledge, and Rational Credibility’ Philosophical Review 115: 449–485.
  • Easwaran, K. and Fitelson, B. (2012) ‘An “Evidentialist” Worry about Joyce’s Argument for Probabilism’ Dialectica 66: 425–433.
  • Easwaran, K. and Fitelson, B. (2015) ‘Accuracy, Coherence, and Evidence’ In T. Szabo Gendler and J. Hawthorne (Eds.), Oxford Studies in Epistemology, Volume 5. Oxford: Oxford University Press.
  • Evnine, S. (1999) ‘Believing Conjunctions’ Synthese 118: 201–227.
  • Fallis, D. (2000) ‘Veritistic Social Epistemology and Information Science’ Social Epistemology 14: 305–316.
  • Fallis, D. (2006) ‘Epistemic Value Theory and Social Epistemology’ Episteme 2: 177–188.
  • Fallis, D. (2007) ‘Attitudes Toward Epistemic Risk and the Value of Experiments’ Studia Logica 86: 215–246.
  • Feldman, R. (1998) ‘Epistemic Obligations’ Philosophical Perspectives 2: 236–256.
  • Feldman, R. (2000) ‘The Ethics of Belief’ Philosophy and Phenomenological Research 60: 667–695.
  • Feldman, R. and Conee, E. (1985) ‘Evidentialism’ Philosophical Studies 48: 15–34.
  • Firth, R. (1981) ‘Epistemic Merit, Intrinsic and Instrumental’ Proceedings and Addresses of the American Philosophical Association 55: 5–23.
  • Foley, R. (1987) The Theory of Epistemic Rationality. Cambridge, MA: Harvard University Press.
  • Fumerton, R. (1995) Metaepistemology and Skepticism. Lanham, MD: Rowman & Littlefield.
  • Goldman, A. (1979) ‘What Is Justified Belief?’ In G. Pappas (Ed.), Justification and Knowledge. Springer: 1–23.
  • Goldman, A. (1986) Epistemology and Cognition. Cambridge, MA: Harvard University Press.
  • Goldman, A. (1999) Knowledge in a Social World. Oxford: Oxford University Press.
  • Good, I. J. (1967) ‘On the Principle of Total Evidence’ British Journal for the Philosophy of Science 17: 319–321.
  • Greaves, H. (2013) ‘Epistemic Decision Theory’ Mind 122: 915–952.
  • Greaves, H. and Wallace, D. (2006) ‘Justifying Conditionalization: Conditionalization Maximizes Expected Epistemic Utility’ Mind 115: 607–632.
  • Haddock, A., Millar, A., and Pritchard, D. (Eds.) (2009) Epistemic Value. Oxford: Oxford University Press.
  • Harman, G. (1988) Change in View. Cambridge, MA: MIT Press.
  • Hempel, C. (1960) ‘Inductive Inconsistencies.’ Synthese 12: 439–469.
  • Huemer, M. (2011) ‘Does Probability Theory Refute Coherentism?’ Journal of Philosophy 108: 35–54.
  • Jenkins, C. S. (2007) ‘Entitlement and Rationality’ Synthese 157: 25–45.
  • Joyce, J. (1998) ‘A Nonpragmatic Vindication of Probabilism.’ Philosophy of Science 65: 575–603.
  • Joyce, J. (2009) ‘Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief’ In Huber and Schmidt-Petri (Eds.) Degrees of Belief. Springer: 263–300.
  • Kagan, S. (1997) Normative Ethics. Boulder, CO: Westview Press.
  • Kitcher, P. (1990) ‘The Division of Cognitive Labor’ The Journal of Philosophy 87: 5–22.
  • Klausen, S. H. (2009) ‘Two Notions of Epistemic Normativity’ Theoria 75: 161–178.
  • Klein, P. and Warfield, T. A. (1994) ‘What Price Coherence?’ Analysis 54: 129–132.
  • Kvanvig, J. (2003) The Value of Knowledge and the Pursuit of Understanding. Cambridge: Cambridge University Press.
  • Kvanvig, J. (2009) ‘Assertion, Knowledge and Lotteries’ In Greenough and Pritchard (Eds.), Williamson on Knowledge. Oxford: Oxford University Press: 140–160.
  • Laudan, L. (1984) Science and Values. Berkeley: University of California Press.
  • Laudan, L. (2006) Truth, Error, and Criminal Law. Cambridge: Cambridge University Press.
  • Leitgeb, H. and Pettigrew, R. (2010a) ‘An Objective Justification of Bayesianism I: Measuring Inaccuracy’ Philosophy of Science 77: 201–235.
  • Leitgeb, H. and Pettigrew, R. (2010b) ‘An Objective Justification of Bayesianism II: The Consequences of Minimizing Inaccuracy’ Philosophy of Science 77: 236–272.
  • Levi, I. (1967) Gambling with Truth. Cambridge, MA: MIT Press.
  • Littlejohn, C. (2012) Justification and the Truth Connection. Cambridge: Cambridge University Press.
  • Locke, D. T. (2014) ‘The Decision-Theoretic Lockean Thesis’ Inquiry 57: 28–54.
  • Maher, P. (1990) ‘Why Scientists Gather Evidence’ British Journal for the Philosophy of Science 41: 103–119.
  • Maher, P. (1992) ‘Diachronic Rationality’ Philosophy of Science 59: 120–141.
  • Maher, P. (1993) Betting on Theories. Cambridge: Cambridge University Press.
  • Maitzen, S. (1995) ‘Our Errant Epistemic Aim’ Philosophy and Phenomenological Research 55: 869–876.
  • Mayo-Wilson, C., Zollman, K. J., and Danks, D. (2011) ‘The Independence Thesis: When Individual and Social Epistemology Diverge’ Philosophy of Science 78: 653–677.
  • McKinnon, R. (2013) ‘Lotteries, Knowledge, and Irrelevant Alternatives’ Dialogue 52: 523–549.
  • McNaughton, D. and Rawling, P. (1991) ‘Agent-Relativity and the Doing-Happening Distinction’ Philosophical Studies 63: 163–185.
  • Muldoon, R. (2013) ‘Diversity and the Division of Cognitive Labor’ Philosophy Compass 8: 117–125.
  • Muldoon, R. and Weisberg, M. (2009) ‘Epistemic Landscapes and the Division of Cognitive Labor’ Philosophy of Science 76: 225–252.
  • Myrvold, W. (2012) ‘Epistemic Values and the Value of Learning’ Synthese 187: 547–568.
  • Nagel, J. (2011) ‘The Psychological Basis of the Harman-Vogel Paradox’ Philosophers’ Imprint 11: 1–28.
  • Nagel, T. (1986) The View from Nowhere. Oxford: Oxford University Press.
  • Nelkin, D. K. (2000) ‘The Lottery Paradox, Knowledge, and Rationality’ Philosophical Review 109: 373–409.
  • Nelson, M. (2010) ‘We Have No Positive Epistemic Duties’ Mind 119: 83–102.
  • Nozick, R. (1974) Anarchy, State, and Utopia. New York: Basic Books.
  • Oddie, G. (1997) ‘Conditionalization, Cogency, and Cognitive Value’ British Journal for the Philosophy of Science 48: 533–541.
  • Olsson, E. J. (2005) Against Coherence: Truth, Probability, and Justification. Oxford: Oxford University Press.
  • Percival, P. (2002) ‘Epistemic Consequentialism’ Proceedings of the Aristotelian Society Supplementary Volume 76: 121–151.
  • Pettigrew, R. (2012) ‘Accuracy, Chance, and the Principal Principle’ Philosophical Review 121: 241–275.
  • Pettigrew, R. (2013a) ‘A New Epistemic Utility Argument for the Principal Principle’ Episteme 10: 19–35.
  • Pettigrew, R. (2013b) ‘Accuracy and Evidence’ Dialectica 67: 579–596.
  • Pettigrew, R. (2013c) ‘Epistemic Utility and Norms for Credences.’ Philosophy Compass 8: 897–908.
  • Pettigrew, R. (2015) ‘Accuracy and the Belief-Credence Connection’ Philosophers’ Imprint. 15: 1–20.
  • Pettit, P. (1988) ‘The Consequentialist Can Recognise Rights’ The Philosophical Quarterly 38: 42–55.
  • Pettit, P. (2000) ‘Non-consequentialism and Universalizability’ The Philosophical Quarterly 50: 175–190.
  • Pollock, J. (1995) Cognitive Carpentry. Cambridge, MA: MIT Press.
  • Portmore, D. (2007) ‘Consequentializing Moral Theories’ Pacific Philosophical Quarterly 88: 39–73.
  • Pritchard, D., Millar, A., and Haddock, A. (2010) The Nature and Value of Knowledge: Three Investigations. Oxford: Oxford University Press.
  • Sen, A. (1982) ‘Rights and Agency’ Philosophy & Public Affairs 11: 3–39.
  • Smart, J. J. C. and Williams, B. (1973) Utilitarianism: For and Against. Cambridge: Cambridge University Press.
  • Smith, M. (2009) ‘Two Kinds of Consequentialism’ Philosophical Issues 19: 257–272.
  • Smithies, D. (2012) ‘The Normative Role of Knowledge’ Nous 46: 265–288.
  • Solomon, M. (1992) ‘Scientific Rationality and Human Reasoning’ Philosophy of Science 59: 439–455.
  • Stalnaker, R. (2002) ‘Epistemic Consequentialism’ Proceedings of the Aristotelian Society Supplementary Volume 76: 152–168.
  • Stapleford, S. (2013) ‘Imperfect Epistemic Duties and the Justificational Fecundity of Evidence’ Synthese 190: 4065–4075.
  • Stich, S. (1990) The Fragmentation of Reason. Cambridge, MA: MIT Press.
  • Thomson, J. J. (1985) ‘The Trolley Problem’ The Yale Law Journal 94: 1395–1415.
  • van Fraassen, B. (1984) ‘Belief and the Will’ The Journal of Philosophy 81: 235–256.
  • Whitcomb, D. (2007) An Epistemic Value Theory. (Doctoral dissertation) Retrieved from Rutgers University Community Repository at: http://dx.doi.org/doi:10.7282/T3ZP46HD
  • Williams, J. R. G. (2012) ‘Gradational Accuracy and Nonclassical Semantics’ The Review of Symbolic Logic 5: 513–537.
  • Zagzebski, L. (2003) ‘Intellectual Motivation and the Good of Truth’ In Zagzebski, L. and DePaul, M. (Eds.) Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford University Press: 135–154.
  • Zollman, K. J. (2007) ‘The Communication Structure of Epistemic Communities’ Philosophy of Science 74: 574–587.
  • Zollman, K. J. (2010) ‘The Epistemic Benefit of Transient Diversity’ Erkenntnis 72: 17–35.

 

Author Information

Jeffrey Dunn
Email: jeffreydunn@depauw.edu
DePauw University
U. S. A.

Benedict De Spinoza: Moral Philosophy

Like many European philosophers in the early modern period, Benedict de Spinoza (1632-1677) developed a moral philosophy that fused the insights of ancient theories of virtue with a modern conception of humans, their place in nature, and their relationship to God. Unlike many other authors in this period, however, Spinoza was strongly opposed to anthropocentrism and had no commitment whatsoever to traditional theological views. His unique metaphysics motivated an intriguing moral philosophy. Spinoza was a moral anti-realist, in that he denied that anything is good or bad independently of human desires and beliefs. He also endorsed a version of ethical egoism, according to which everyone ought to seek their own advantage; and, just as it did for Thomas Hobbes, this in turn led him to develop a version of contractarianism. However, Spinoza’s versions of each of these views, and the way in which he reconciles them with one another, are influenced in fascinating ways by his very unorthodox metaphysical picture.

The topics mentioned so far can be related comfortably to twenty-first century debates in moral philosophy. Yet Spinoza was also very interested in another issue that is moral only in the more archaic sense that it pertains to the good life: namely, the means by which humans may (to some extent) achieve mastery over their passions. Though this topic was of central importance to Spinoza, the pride of place he awarded it in his Ethics reflects the fact that seventeenth-century conceptions of moral philosophy were, in subtle but important ways, different than our own.

Table of Contents

  1. Guiding Metaphysical Principles
    1. Substance Monism
    2. Necessitarianism
    3. The Conatus Doctrine
    4. Activity and Passivity
  2. Moral Philosophy in Spinoza’s System
    1. Spinoza’s Metaethics: Moral Anti-Realism
    2. Spinoza’s Ethics: Ethical Egoism, Contractarianism, and Virtue Theory
      1. The Greatest Good and the Inclination to Morality
      2. Spinoza’s Contractarianism
      3. Spinoza’s Virtue Theory and the “Free Man”
    3. Applications of Spinoza’s Moral Theory
      1. Suicide and Self-Harm
      2. Lying and Deceit
      3. Animal Ethics
      4. Environmental Ethics
  3. Spinoza’s Remedies for the Passions
    1. Via Knowledge of the Affects
    2. Via Removing the Idea of an External Cause
    3. Via the Endurance of Rational Affects
    4. Via the Multiplicity of the Causes of Rational Affects
    5. Via the Re-Ordering of the Affects
  4. Conclusion
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Guiding Metaphysical Principles

The name of Spinoza’s most famous work is the Ethics, but he does not really broach the topic of ethics until part four of the five-part work. The reason for this is that although his aim is to set forth “the right way of living” (E4app, G II/266) and to explain “what freedom of mind, or blessedness, is” (E5pref, G II/277), his accounts of these things depend upon certain key metaphysical principles that he feels must be established first.

This article provides only brief explanations of the relevant principles. For more detailed discussions of each of them, see the main article on Spinoza.

a. Substance Monism

In Cartesian philosophy, a substance is something that does not depend for its existence on anything else—or, in the case of created substances, anything other than God (CSM I, 210). A mode is something that is not a substance (for instance, a property, quality, or attribute). Descartes appears to take the human body and mind to be paradigmatic substances, and the extended properties and thoughts of the body and mind (respectively) to be paradigmatic modes. Spinoza was critical of Descartes for giving a non-univocal definition of the term ‘substance,’ so that the predicate means something different when applied to God than when applied to a human. Spinoza’s alternative approach was to stick to the most general definition: a substance is something that is “in itself and is conceived through itself, that is, that whose concept does not require the concept of another thing, from which it must be formed” (E1d3).

In defining a substance this way, Spinoza avoids the equivocation involved in the Cartesian conception of substances. However, he also quickly concludes that given this definition, humans are not substances. Indeed, Spinoza argues, there can be only one substance, God (E1p14), and everything else is merely a mode of God (E1p15). As a result, Spinoza conceives of God as a being that is absolute and perfect by its very nature; humans, by contrast, are dependent and imperfect by their very nature.

b. Necessitarianism

Although ordinarily we speak as though things could have been different than they in fact are—you could have turned left rather than right, the election might have gone differently, and so on—Spinoza denies that these alternative scenarios are genuinely possible. He provides several different arguments for this conclusion, but perhaps the simplest is based upon the thought that, since the world is a mode of God, and God could not be different than it is, it follows that “Things could have been produced by God in no other way, and in no other order than they have been produced” (E1p33). This divine necessitarianism trickles down: humans, too, could not have acted otherwise than they did. The fact that we ordinarily believe ourselves capable of acting otherwise is an illusion produced by our ignorance of both the physical and psychological forces influencing us, as well as of our own nature (E3p2s).

c. The Conatus Doctrine

Perhaps the most important metaphysical principle involved in Spinoza’s ethical theory is his view that “Each thing, as far as it can by its own power, strives to persevere in its being” (E3p6). The interpretation of this principle is the source of much scholarly disagreement, but a few things are clear. The striving [conatus] at issue is not to be confused with conscious effort, since Spinoza takes the principle to govern bodies as well as minds. Nor is the conatus to be confused with the metabolic processes of a living organism, since Spinoza takes the principle to govern (what we ordinarily consider to be) non-living things as well as living ones. Spinoza is making the metaphysical claim that each thing is possessed of an inner force, by which it continuously reasserts its own existence.

This doctrine is particularly important for understanding Spinoza’s moral theory, since Spinoza accepts psychological egoism on the basis of it: “When this striving is related only to the mind, it is called will; but when it is related to the mind and body together, it is called appetite. This appetite…is the very essence of man, from whose nature there necessarily follow those things that promote his preservation” (E3p9s).

d. Activity and Passivity

In transitioning from his metaphysics to his moral theory, Spinoza relies heavily upon two concepts, activity and passivity, that come to take the place of traditional axiological concepts like good and evil. Something is active insofar as it produces various effects through its striving; conversely, it is passive insofar as it and its states are produced by external causes (E3d1–3). Both activity and passivity are treated as matters of degree. Thus God, the total cause of all things, is active in the highest degree and not at all passive, while humans (since they are not substances) are always partly active and partly passive, causally dependent upon God as well as upon other modes.

With respect to the human mind, activity takes the form of rational or adequate cognition (E3p1). Actions of the mind are adequate ideas, which increase its power of acting, while passions of the mind are inadequate, confused ideas, which decrease its power of acting. Spinoza’s conception of passions is quite general, so, for example, what we would call a “dispassionate” state of melancholy could for him qualify as a powerful passion because of how much it diminishes our activity. This should be borne in mind when we turn, in section 3, to considering Spinoza’s account of how to overcome our passions.

2. Moral Philosophy in Spinoza’s System

a. Spinoza’s Metaethics: Moral Anti-Realism

Spinoza’s metaphysical views quickly commit him to a version of moral anti-realism. A moral realist holds that at least some things are good or bad independently of what we desire or believe to be the case. Spinoza, in numerous passages in the Ethics and earlier works, denies that there are any such moral qualities. His rejection of moral realism is tied up with his rejection of teleological explanations of nature, for he sees the attribution of qualities like goodness or perfection as an error that is based upon the false belief that nature was designed by God with humanity in mind. Spinoza explains, “After men persuaded themselves that everything which happens, happens on their account, they had to judge that what is most important in each thing is what is most useful to them… Hence, they had to form these notions, by which they explained natural things: good, evil, order, confusion, warm, cold, beauty, ugliness” (E1app, G II/82). This family of concepts, which includes moral and aesthetic concepts along with concepts of sensible qualities, Spinoza holds to be produced by the imagination rather than reason. Hence the concepts “by which ordinary people are accustomed to explain Nature…do not indicate the nature of anything, only the constitution of the imagination” (E1app, G II/83).

In addition to providing etiological accounts intended to explain why people make the mistake of treating moral qualities as objective (and thereby to undermine the belief that they are objective), Spinoza develops two distinct arguments for his anti-realism. His first argument for anti-realism is that if moral qualities like evil or imperfection were objective, then it would be conceivable “that Nature sometimes fails or sins, and produces imperfect things” (E4pref, G II/207). But this is inconceivable: such a possibility supposes that there is a goal or standard that nature has fallen short of, yet there is no such goal or standard: “The reason why…God, or Nature, acts and the reason why it exists, are one and the same. As it exists for the sake of no end, it also acts for the sake of no end” (ibid). Again, just as in his earlier discussion, Spinoza’s denial of the objectivity of moral qualities is based upon his rejection of natural teleology. The rejection of natural teleology, in turn, is based upon his substance monism and necessitarianism: “all things follow from the necessity of the divine nature, and hence…whatever seems immoral, dreadful, unjust, and dishonorable, arises from the fact that [we conceive] the things themselves in a way which is disordered, mutilated, and confused” (E4p73s).

It is worth mentioning a second argument that comes shortly after, but appears to have very different motivations: “As far as good and evil are concerned, they also indicate nothing positive in things, considered in themselves… For one and the same thing can, at the same time, be good, and bad, and also indifferent. For example, music is good for one who is melancholy, bad for one who is mourning, and neither good nor bad to one who is deaf” (E4pref, G II/208). If moral qualities were objective, then nothing could have contrary moral qualities at one and the same time. But many things do have contrary moral qualities at one and the same time, with respect to different observers. Therefore, moral qualities are not objective, in the sense that they “indicate nothing positive in things, considered in themselves” (ibid). This argument is quite different than the previous one. The first argument draws out the a priori incoherence that would be involved in the very idea of objective moral qualities, while the second is based upon the empirical premise that different people may judge a thing to have contrary moral qualities. It is an ancestor of the argument from disagreement often used to defend moral relativism.

In spite of the fact that Spinoza rejects moral realism, he does not advocate for the elimination of moral language. To see why, consider an advantage that the moral realist seems to have over Spinoza’s anti-realism. The moral realist, as Spinoza sees it, holds that in cases of moral judgment, we first recognize something to be good (for example), and then this results in our forming a desire for that thing. Though Spinoza rejects this account of moral judgment, one of its benefits is that it allows us to distinguish between what is desired and what is genuinely desirable. Since it often happens that a person wants something and later discovers it really to be undesirable — or even wants something in spite of the fact that he knows it to be undesirable — the distinction is an important one to preserve. For example, we want to be able to make sense of the fact that although someone wants to commit suicide, this is not really desirable; the moral realist’s picture gives us a way to do this by distinguishing the (true) claim that this person desires to commit suicide from the (false) claim that it is good/desirable for this person to commit suicide.

Yet Spinoza thinks the moral realist’s story is exactly backwards: “we neither strive for, nor will, neither want, nor desire anything because we judge it to be good; on the contrary, we judge something to be good because we strive for it, will it, want it, and desire it” (E3p9s; cf. 3p39s). He thus subscribes to a desire-satisfaction theory of value: what is ultimately of value is the satisfaction of desire; things become valuable only by virtue of their being desired, or their serving to satisfy some desire. (For more on this, see Youpa [2010, 209, fn. 1], and Lebuffe [2010, 152–9].) So it may seem that Spinoza will have a problem making the distinction between what we think is good and what is genuinely good for us.

Spinoza agrees that we need this distinction, but holds that our judgments about what is genuinely good for us are based upon an “idea of man” we have formed “as a model of human nature” (E4pref, G II/208). To hold on to the distinction between what a person desires and what is genuinely desirable, then, Spinoza wants to preserve our ordinary talk of good and evil, with the caveat that such talk refers only to the relation between ourselves and an idealized model human (Curley [1979, 356–62], Nadler [2006, 215–9], and Hübner [2014, 136–140]). Hence, Spinoza writes, “I shall understand by good what we know certainly is a means by which we may approach nearer and nearer to the model of human nature we set before ourselves. By evil, what we certainly know prevents us from becoming like that model” (ibid). Since the model is an idealization, the judgment that something is good or evil does not involve any commitment to objective, mind-independent qualities of goodness or evilness. Yet having such a model is useful, since it allows us to make judgments about what will be good or bad for us as distinct from what we presently happen to desire.

b. Spinoza’s Ethics: Ethical Egoism, Contractarianism, and Virtue Theory

The previous section established that Spinoza is a moral anti-realist in the sense that he denies that there exist mind-independent moral properties. Nevertheless, on most readings of the Ethics, Spinoza is also an ethical egoist, since he holds that reason “demands that everyone love himself, seek his own advantage…and absolutely, that everyone should strive to preserve his own being as far as he can” (E4p18s; see also TTP Ch. 16, 175). These two views are compatible, however, since Spinoza’s approach to developing his positive moral theory is to reduce normative claims to considerations of self-interest in a manner reminiscent of Hobbes (Curley 1988, 119–124). Perhaps the major difference between the Spinozist and the Hobbist approaches to egoism is that Spinoza provides a metaphysical argument for the view, in contrast to Hobbes’ psychological argument. Specifically, Spinoza bases his ethical egoism upon his conatus doctrine.

Spinoza’s initial argument for the claim that reason demands that everyone seek his own advantage is brief: “Since reason demands nothing contrary to Nature, it demands that everyone…seek his own advantage… This, indeed, is as necessarily true as that the whole is greater than its part” (E4p18s). Breaking the argument down:

  1. Reason demands nothing contrary to Nature.
  2. It is contrary to Nature for someone not to seek his own advantage.
  3. So, reason demands that everyone seek his own advantage.

Both premises hinge upon what is meant by the claim that something is “contrary to Nature.” By this, Spinoza seems to mean something impossible, something that cannot be, by virtue of incompatibility with the laws either of logic or of nature. On this interpretation, premise (1) is Spinoza’s nod to the commonly held principle that ought implies can: one can be morally bound to do only what one is able to do. More importantly, given this interpretation, the second premise comes out as a conceptual truth grounded in part of the conatus doctrine.

In E3p4, which he references in his argument for egoism, Spinoza argued, “No thing can be destroyed except through an external cause.” He takes this to entail that “Each thing, as far as it can by its own power, strives to persevere in its being” (E3p6). So, in Spinoza’s view, we have a purely metaphysical argument that it would be “contrary to Nature” for someone not to seek his own advantage. It would be contrary to Nature for anything not to seek its own advantage, insofar as it has the power to do so.

The second premise entails psychological egoism, for it entails that each person will seek his own advantage at all times. Spinoza’s argument for ethical egoism in this sense depends upon psychological egoism, and so it may seem reminiscent of Hobbes’ rationale for the similar conclusion that “of the voluntary acts of every man the object is some good to himself” (L I.xiv; p. 82). However, Hobbes reaches this view on the basis of his account of the psychology of voluntary acts: a voluntary act proceeds from the will, and a person’s will is just the last appetite that strikes him after a process of deliberation (L I.vi; p. 33). Since “whatsoever is the object of any man’s appetite…he for his part calleth good” (L I.vi; p. 28), Hobbes would agree with Spinoza that each person will seek what he considers to be his own advantage at all times. In spite of the similarity of their conclusions, Spinoza’s argument is grounded in the metaphysics of the conatus doctrine, while Hobbes’ argument is grounded in his psychological theory.

One of the philosophical problems with Spinoza’s version of ethical egoism has to do with whether, and to what extent, Spinoza’s view can really be a moral theory at all. Given the argument for the view, it is unclear how Spinoza can take the dictates of reason to be prescriptive. For example, according to Rutherford (2008), Spinoza treats the dictates of reason as adequate ideas that, when we possess them, cause us to act in ways that are conducive to our actual self-interest. If so, to follow the dictates of reason is just to be caused to behave in certain ways, which sits awkwardly alongside the thought that such dictates are prescriptive in any ordinary sense. This topic is the subject of ongoing scholarly inquiry—responses to the problem have been proposed by Kisner (2011, 118) and Steinberg (2014)—and it is closely related to the issue (flagged at the outset of this article) that Spinoza’s conception of ethics is in many ways quite different from our own.

i. The Greatest Good and the Inclination to Morality

For an egoist, the question as to what is good for an individual is crucial, for the answer to this question will determine what that individual ought, morally, to do. And Spinoza’s conception of the good is stereotypically egoistic: “By good I shall understand what we certainly know to be useful to us” (E4d1). Likewise, to be virtuous is simply to have and to exercise the power to do what is in our nature, and (as per the conatus doctrine) what is in our nature is to seek our own advantage as far as we are able (E4d8; 4p20). As a result, strength of character is also accounted for in self-interested terms.

Many passages in the Ethics make it appear that Spinoza simply thinks that what is best for each of us is the continuation of our lives. For example, he writes that “No one can desire to be blessed, to act well and to live well, unless at the same time he desires to be, to act, and to live, that is, to actually exist” (E4p21). Hence, the principle of seeking one’s own advantage and preserving one’s being is “the first and only foundation of virtue” (E4p22c), and obeying this principle is the only pursuit that is good for its own sake (E4p25). If this were so, then we might expect Spinozist morality to license all manner of violations of traditional morality in the name of self-preservation and the advancement of our own interests. Surprisingly, although he takes self-interest and self-preservation as the foundations of morality, Spinoza nevertheless holds that “The good which everyone who seeks virtue wants for himself, he also desires for other men” (E4p37). Although virtue is founded in rational self-interest, rational self-interest in turn urges us to desire the good of others.

To see why Spinoza thinks this, we need to understand this “good” that is desired by “everyone who seeks virtue.” The good in question, which is supposed to trump all other goods, is not actually our own lives, but what those lives are best spent in obtaining—the knowledge of God. Spinoza writes, “Knowledge of God is the mind’s greatest good; its greatest virtue is to know God” (E4p28). The argument for this is characteristically metaphysical, and again based upon the conatus doctrine. Spinoza argues that the “striving of the mind…is nothing but understanding,” and “cannot conceive anything to be good for itself except what leads to understanding” (E4p26d). Our innate desire to understand nature is, in his view, the very essence of our minds, and so this drive to understand also characterizes the good for us. Finally, “The greatest thing the mind can understand is God” (E4p28d), since ‘God’ signifies the whole of nature, so it follows that “the mind’s greatest advantage…is knowledge of God” (ibid).

Therefore, in Spinoza’s view, our greatest good is not the sort of thing that is subject to natural scarcity, nor need it be the object of competition. Rather, it is “common to all, and can be enjoyed by all equally” (E4p36). And because, in Spinoza’s view, other humans are more useful to us to the extent that they are rational (E4p35c1), it is entirely to our benefit when others pursue the same good—understanding—that we ourselves seek; for detailed exposition of Spinoza’s argument that it is to our benefit to pursue the good of others, see Della Rocca (2004, 125–8), Kisner (2009), and Grey (2013). This is why Spinoza thinks humans have a rational impetus to act in moral (that is, benevolent) ways toward others from a starting point of pure self-interest: “The desire to do good generated in us by our living according to the guidance of reason, I call morality” (E4p37s1).

ii. Spinoza’s Contractarianism

So far, Spinoza’s moral theory might not appear to be capable of answering the practical questions it is ordinarily hoped such a theory will answer. The conception of the good just outlined is so strikingly focused on human intellectual life that the resulting moral theory may seem far removed from ordinary moral matters. However, Spinoza has a bit more to say about morality beyond his claim that it is constituted by the pursuit of knowledge of God and the desire to do good for others. One important strand of Spinoza’s moral thought is a version of moral contractarianism, the view that we may become normatively bound to behave in certain ways on the basis of agreements or contracts we make when we live in society with others. His version of contractarianism is heavily influenced by Hobbes, from whom Spinoza appears to have drawn a number of key ideas. (This article deals only briefly with those aspects of Spinoza’s contractarianism that bear upon morality; see the article on Spinoza’s Political Philosophy for more information about this topic.)

It might seem surprising that Spinoza thinks humans need to live in society at all. Given that our greatest good is knowledge of God, ought we not all retreat to the mountaintop and spend our time in metaphysical inquiry? Spinoza’s reason for denying this is his pessimistic view of the prospects for humans overcoming all of their passions. Even the wisest philosopher requires assistance from her community in the pursuit of her greatest good. On this point, Spinoza disagrees with Descartes, who holds that “Even those who have the weakest souls could acquire absolute mastery over all their passions” (CSM I, 348). Spinoza’s view, by contrast, is that on account of the force of their passions, people “are often drawn in different directions and are contrary to one another, while they require one another’s aid” (E4p37s2, citations elided), and that these passions can never completely be overcome. Thus even the most wise and temperate among us has reason to enter a social contract. Because of our need for one another’s aid—whether to study philosophy or gain security—we have reason to live together with others in society. And because it is extremely difficult to moderate and restrain people’s worst passions, we cannot enjoy the benefits of civil society without entering a social contract.

With this observation in the background, the argument for moral contractarianism appears in a very abbreviated form in a scholium in the Ethics:

In order, therefore, that men may be able to live harmoniously and be of assistance to one another, it is necessary for them to give up their natural right and to make one another confident that they will do nothing which could harm others… By this law, therefore, society can be maintained, provided it appropriates to itself the right everyone has of avenging himself, and of judging good and evil. (E4p37s2, G II/237–8)

The argument is one commonly associated with classical social contract theories. Because humans are unable to live peacefully with one another so long as they retain their natural right to act as they please, it is in each person’s best interest to give up that right to the state, on the condition that everyone else does the same.

For this reason, Spinoza holds the prima facie surprising view that laws are morally binding on us even in cases in which those laws are not rational. In conflicts between the laws of our society and the dictates of our reason, the laws win out. Likewise, although in the context of his metaphysics Spinoza treats evil and sin as functions of an individual’s power, when he is writing about such things in the context of civil society he provides a very different picture. For example, he writes, “[E]veryone is bound to submit to the state. Sin, therefore, is nothing but disobedience…” (E4p37s2, G II/238); “A wrong occurs when a citizen or subject is forced to suffer some injury at the hands of another…contrary to the edict of the sovereign power” (TTP Ch. 16, 179). Why does law figure so prominently in discussions of morality in the context of civil society? In his Theological-Political Treatise, where he develops these ideas at length, Spinoza argues, “it is our duty [tenemur] to carry out all the orders of the sovereign power without exception, even if those orders are quite irrational. For reason bids us carry out even such orders so as to choose the lesser of two evils” (TTP Ch. 16, 177). The argument is that even if we recognize what is required by law to be irrational, it cannot be as irrational as it would be to violate the law, and thereby to become “enemies of the state and to act against reason which urges us to uphold the state with all our might” (ibid).

iii. Spinoza’s Virtue Theory and the “Free Man”

Another way in which Spinoza attempts to make his moral theory easier to put into practice is by providing a virtue theory based on it. Spinoza spends the latter sections of part four of the Ethics developing a virtue theory of a fairly traditional sort, outlining which character traits and behaviors are virtues, and which are vices, on the conception of morality he has developed. He concludes this part of the work with some claims “concerning the free man’s temperament and manner of living,” where the “free man” is understood to be someone who lives wholly according to the guidance of reason. Since the very idea of a human being who lives wholly according to the guidance of reason is apparently contradictory—Spinoza has earlier observed that “man is necessarily always subject to passions” (E4p4c)—the discussion of the free man is not properly understood as describing an attainable goal. However, many scholars (such as Garrett [1990, 229–30] and Nadler [2006, 219]) take this discussion of the free man to be Spinoza’s presentation of the model of human nature he promised in the preface to Ethics 4. If so, then the description of the free man may best be seen as a guiding ideal, a character that ordinary people should aspire to be like, at least insofar as they are able.

Spinoza’s description of the free man’s way of living is based upon his account of virtues: if a character trait is grounded in our reason and our pursuit of understanding, it is a virtue; if it is grounded in our passions or ignorance, it is a vice. These considerations are clearly rooted in his conception of our greatest good (as outlined above). Although Spinoza’s treatment of many of the virtues is in keeping with traditional conceptions of virtue, he often parts ways with these traditional conceptions. For example, his conclusion that tenacity and nobility are virtues is in keeping with tradition. (Why are they virtues? Tenacity, he says, is the character trait corresponding to our rational striving for self-preservation, and nobility is the character trait corresponding to our rational striving for the benefit of others [E3p59s]. So both character traits are grounded in reason, not the passions.) However, Spinoza also argues that humility, repentance, and pity—character traits highly esteemed by traditional religious authorities—are not virtues, for they are “useless” and “do not arise from reason” (E4p50, 53, and 54). In his view, these character traits are not really virtues even if they do occasionally cause us to pursue the good, for they are only accidentally connected to the pursuit of the good. Reason, by contrast, is essentially connected to the pursuit of the good. As a result, anything good that we might be led to do out of pity (for instance), we could just as well have been led to do by reason. Being guided by pity, then, can be no better than being guided by reason. Moreover, pity always involves sadness, a form of disempowerment, so considered in itself, it is evil. Hence being guided by pity is inevitably worse than being guided by reason: “a man who lives according to the dictate of reason strives, as far as he can, not to be touched by pity” (E4p50c).

When Spinoza characterizes the “free man,” someone who lives wholly according to the guidance of reason, we should therefore expect only partial continuity with traditional conceptions of morality and virtuous living. The free man, Spinoza reasons, will pick his battles wisely, showing his virtue both in avoiding danger and in overcoming it (E4p69). He will always act honestly (E4p72). And he will seek to live in society with others rather than in solitude (E4p73). Nevertheless, the free man will graciously decline favors or gifts from those who do not follow the guidance of reason and who are ruled by their emotions (E4p70). Accepting such favors or gifts is liable to be dangerous, for the irrational gift-giver will inevitably value them more highly than the free man does; the free man reserves his gratitude for the friendship of other rational people (E4p71), insofar as such friendship aids him in his pursuit of greater understanding. In practice no actual human could live exactly as the free man does, for (as mentioned in section 1 above) only a substance can be fully rational and active, and humans are not substances. Nevertheless Spinoza’s presentation of these claims suggests that he takes them to be desirable ways of living, because they derive from “strength of character, that is, [from] tenacity and nobility” (E4p73), the primary virtues.

c. Applications of Spinoza’s Moral Theory

In the course of developing his moral theory, Spinoza sometimes applies it in passing to what he recognizes are traditional moral problems. He is often somewhat dismissive of these problems, and his treatment of them rarely has the depth they receive in works of applied moral philosophy. However, his responses are often interesting because, given the demands of other parts of his philosophical system, his proposals can be surprising and idiosyncratic. This article discusses four of them: the moral permissibility of suicide, of lying, of causing harm to animals, and of causing harm to the environment.

i. Suicide and Self-Harm

One traditional moral problem regards the moral permissibility of self-harm, the ultimate case of which is suicide. Spinoza does not agree with most of the traditional religious reasons for treating suicide as a sin. For example, an explanation of the wrongness of suicide common in the Judeo-Christian religious traditions appeals to one of the Ten Commandments: “Thou Shalt Not Kill.” According to this family of explanations, suicide is a sin because it involves taking a human life, which God has commanded humans not to do. Spinoza takes the conception of God upon which this explanation relies to be false: many imagine “God as a ruler, lawgiver, king, merciful, just and so forth; whereas these are all merely attributes of human nature, and not at all applicable to the divine nature” (TTP Ch. 5, 53). God simply does not issue commandments in the way that a king issues commandments. Given this fact, Spinoza thinks, it makes little sense to try to explain moral claims like “Suicide is a sin” by appeal to such commandments.

Although he disagrees with traditional reasons for taking suicide to be immoral, he nevertheless agrees that suicide is in fact immoral. On this point, Spinoza is very clear: someone who commits suicide is “weak-minded and completely conquered by external causes contrary to their nature” (E4p18s). This conclusion is primarily a result of the conatus doctrine, since that doctrine forces Spinoza to deny that anyone can kill himself, strictly speaking. There must always be external causes that can be assigned to explain suicide or self-harm. But that is merely a descriptive claim; the evaluative claim that it is a “weak-minded” act derives from Spinoza’s ethical egoism. To be virtuous is to strive to preserve one’s being, so suicide is as far from virtue as one can go, in Spinoza’s view.

ii. Lying and Deceit

In his characterization of the “free man” at the end of part four of the Ethics, Spinoza argues that a perfectly rational being “always acts honestly, not deceptively” (E4p72). The argument for this, on the face of it, anticipates Kant’s famous argument for the same conclusion. Spinoza reasons that if a perfectly rational being acted deceptively, he would do so “from the dictate of reason” (because, presumably, that is how a perfectly rational being does anything); but then it would be rational to act in that way, and “men would be better advised to agree only in words, and be contrary to one another in fact” (E4p72d). Spinoza takes this consequence to be absurd, for it is in our interest to bring others into as much agreement with our natures as possible (E4p31c), which living deceitfully would prevent.

One puzzle that this argument raises is the apparent conflict between Spinoza’s claim that a perfectly rational being would always act honestly and his claim that such a being would never do anything that brought about its own destruction: there seem to be possible cases in which only deception would preserve the free man’s life, so that honesty and self-preservation cannot both be unconditional demands of reason. Spinoza does not explicitly attempt to resolve this problem in the Ethics, though commentators have attempted to do so on his behalf in a variety of ways (Garrett 1990, 228–33).

iii. Animal Ethics

As should not be surprising given his ethical egoism, Spinoza is not sympathetic to the thought that we ought to worry ourselves about our treatment of either animals or the environment. With respect to animals, Spinoza writes, “the law against killing animals is based more on empty superstition and unmanly compassion than sound reason” (E4p37s1). Reason dictates that we seek out the companionship of other humans because they share our nature, and what is good for us is good for them. However, since non-human animals differ in nature from us, reason dictates that we “consider our own advantage, use them at our pleasure, and treat them as is most convenient for us” (ibid). So, in spite of the fact that Spinoza does not view humans as metaphysically privileged—for instance, he disagrees with the Cartesian view that humans, but not other animals, have minds (ibid)—he nevertheless holds that we need not concern ourselves with the welfare of non-human animals. There may be situations in which our own welfare depends upon the welfare of a non-human animal, as when a farmer’s livelihood depends upon the welfare of his stock. But only in such situations will a human have reason to care about the welfare of a non-human. That said, it is not clear that this is the view he ought to have adopted, given his first principles (Grey 2013, 378–382).

iv. Environmental Ethics

With respect to the environment, matters are less clear-cut. Spinoza does acknowledge that humans are by their nature dependent upon their environment:

It is the part of a wise man, I say, to refresh and restore himself in moderation with pleasant food and drink, with scents, with the beauty of green plants, with decoration, music, sports, the theater, and other things of this kind, which anyone can use without injury to another. For the human Body is composed of a great many parts of different natures, which constantly require new and varied nourishment… (E4p45s)

Unfortunately, after this picturesque passage, Spinoza does not go on to consider what our dependence upon our environment might entail with regard to our treatment of it. Much of our concern regarding environmental ethics today is based on our recognition that the environment is not an inexhaustible source of nourishment and wealth; to a seventeenth-century author, this possibility would have seemed bizarre.

That being said, Spinoza’s views about animal ethics can be applied more or less directly to the environment as well. It would be irrational to work to preserve the environment for its own sake, since what is good for the environment is not necessarily good for us. However, insofar as we are concerned for the well-being of ourselves and other humans, and we recognize that well-being to depend upon the environment, it will be rational for us to preserve the environment—not for its sake, but for ours. This thought is at least hinted at in the quoted passage, where Spinoza notes that we are to “refresh and restore” ourselves only using means that “anyone can use without injury to another.” Insofar as the production of our “pleasant food and drink” turns out to cause injury to the environment upon which our neighbors (or we ourselves) depend, the practice would be open to moral criticism.

Some, such as Naess (1977), have gone further than this, arguing that Spinoza’s system provides a hospitable metaphysical background for ecology. However, as Kober (2013, 58–9) notes, one of the consequences of Spinoza’s views is that important conceptual tools of ecology lose their purchase. For example, Spinoza allows no distinction between what is natural and what is artificial. And, more importantly, there is no sense to be made of the designation of certain types of human activities as exploitative of the environment or of animals.

3. Spinoza’s Remedies for the Passions

In the seventeenth century, moral philosophy was not yet primarily preoccupied either with accounting for the nature and origins of morality or with establishing general principles governing moral obligation—though, as we have seen, Spinoza does develop some views on these topics en route to the final part of the Ethics. Rather, in this period, one of the central aims of moral philosophy was to provide the reader with psychological tools that could be used to cultivate desirable states of being. For this reason, seventeenth-century texts on moral philosophy tend to be more akin to self-help books than to twenty-first-century moral philosophy. The first half of Ethics V exemplifies this tendency. There, Spinoza attempts to provide a guide to training our minds so as to “bring it about that we are not easily affected with evil affects” (E5p10s).

‘Passion’ [passio] is a technical term for which Spinoza provides a careful definition. He writes, “An affect which is called a passion of the mind is a confused idea, by which the mind affirms of its body, or of some part of it, a greater or lesser force of existing than before, which, when it is given, determines the mind to think of this rather than that” (EIII Gen. Def. of Aff., G II/203–4). This definition connects the passions to his theory of ideas, since all passions are confused ideas. It also connects the passions to the conatus doctrine: the passions represent changes in the body’s “force of existing” [existendi vim], and this force of existing is presumably the same force introduced in his discussion of the innate striving of all things to persevere in existing (see section 1 above).

Spinoza appeals to both of these pieces of theoretical machinery, along with a few interesting additions, when he presents his five remedies for overcoming or restraining the passions. It is worth noting that although the view that we should strive to diminish the strength of our emotions has a very Stoic ring to it, he expressly distances himself from the Stoics. His reason for this is their belief “that the emotions depended absolutely on our will, and that we could absolutely govern them” (E5pref), which Spinoza thinks involves a misunderstanding of the structure and powers of the human mind. This comes out in his remedies for the passions: of the five remedies, only two (the first and fifth) are plausibly activities that we can perform intentionally.

a. Via Knowledge of the Affects

Spinoza claims that whenever we “form a clear and distinct idea” of a passion, it will no longer be a passion (E5p3). Since all passions are confused ideas—indeed, this is a core component of the definition of a passion—the most straightforward way to eliminate a passion is to eliminate the confusion that is the basis for that passion. In Spinoza’s view, the idea of an idea is not really distinct from the idea itself (E2p21s), so the clear and distinct idea we form of a passionate affect is not really distinct from that affect. But, since the clear and distinct idea is not confused, to conceive of it in this way is to eliminate the confusion from the original passion. Once we have eliminated this confusion, “the affect will cease to be a passion” (ibid). This approach to overcoming a passion does not eliminate the affect that constitutes the passion, but merely eliminates that feature of the affect in virtue of which it constituted a passion. The confusion a passionate affect involves is not intrinsic to that affect, in Spinoza’s view, and when that confusion is stripped away, the affect nevertheless remains.

Spinoza does not say much to clarify how this procedure is supposed to work. However, in at least one of Spinoza’s accounts of confusion, to say that an idea is confused is to say that it is partly determined by external causes (E2p29s). Thus, to strip away the confusion from a passion would require one somehow to strip away some of its causes. But that possibility appears to be inconsistent with Spinoza’s conception of causation, according to which an effect must be understood through its causes (Lin [2009, 270]; Bennett [1984, 336]). Scholars remain divided as to whether this difficulty, commonly referred to as the Changing Problem, is surmountable; see Marshall (2012) for some proposed solutions on Spinoza’s behalf.

b. Via Removing the Idea of an External Cause

All inadequate ideas have external causes (E3p1), so all passions are guaranteed to have external causes as well. In some cases, a passion not only has an external cause, but is such that it represents that cause (or purported cause). For example, love is joy accompanied by the idea of an external cause of that joy (E3 Def. of Affs. VI, G II/192). That is, the passion of love is a composite idea, and its parts are (i) joy, and (ii) the representation of something external as producing that joy. In such cases, we can destroy the passion by mentally separating out the idea of the external cause that it includes. As Spinoza puts it: “For what constitutes the form of [such passions] is joy, or sadness, accompanied by an external cause… So if this is taken away, the form of love or hate is taken away at the same time. Hence, these affects, and those arising from them, are destroyed” (E5p2d).

c. Via the Endurance of Rational Affects

Spinoza’s third remedy for overcoming the passions is less a method than an observation about a natural consequence of our emotional psychology. One factor that determines the force with which an emotion strikes us is whether we conceive of its cause as present. For instance, Spinoza writes, “An affect whose cause we imagine to be with us in the present is stronger than if we did not imagine it to be with us” (E4p9). Examples of this phenomenon are abundant. Whether snakes are present or absent, Yetta fears them. However, if she thinks snakes are present, that fact serves to fuel her fear; and if she thinks them absent, her fear is greatly diminished. Affects that are produced by ordinary external objects—fear of snakes, love for one’s car, desire for pie, and so forth—all naturally vacillate in force over time based on whether we take their objects to be present or absent.

By contrast, affects “arising from or aroused by reason” (E5p7) have a very different profile. The object of such an affect is “necessarily related to the common properties of things” (E5p7d), which are pervasive features of reality, such as the property of being extended. In Spinoza’s view, “we always regard [such properties] as present,” and we “always imagine [them] in the same way” (ibid). So, such an affect will endure over a longer period of time, and with a more constant degree of force, than affects produced by external things. In the long run, Spinoza thinks, irrational affects will be forced to “accommodate themselves” more and more frequently to the rational affects. In this way, we will naturally tend over time toward rational affects and away from irrational ones. Spinoza’s line of argument here is thus aimed at defending the consoling thought that reason will tend to win out, rather than at providing a technique we can apply to help reason win out.

d. Via the Multiplicity of the Causes of Rational Affects

Recall from section 2 that Spinoza takes the greatest good for all humans to be knowledge of God. Fortunately, the idea of God is one that we “really fully possess” (E5p20s, G II/294; cf. E2p45), and so our greatest good can be realized. Indeed, since everything in nature is a mode of God, in Spinoza’s view, the skilled philosopher can revive and meditate upon the idea of God on the basis of any experience whatsoever; every experience can occasion a train of thought that leads the mind back to its greatest good, and the joy that it brings. But these facts suggest a fourth way in which we may diminish the force of our passions, namely by means of “the multiplicity of causes by which affections related to common properties or to God are encouraged” (E5p20s, G II/293).

As with the third method, Spinoza here has in mind the comparative force of rational affections over irrational ones. While the third remedy appeals to Spinoza’s view that the objects of rational affections are constant and unchanging, the fourth remedy appeals to his view that the causes of rational affections are universal and omnipresent. This is relevant because Spinoza holds that

[A]s an image, or affect, is related to more things, there are more causes by which it can be aroused and encouraged, all of which the mind…considers together as a result of the affect itself. And so the affect is the more frequent, or flourishes more often, and engages the mind more. (E5p11d)

This is another way in which rational affects gradually become stronger and eventually may overpower the passionate affects. Passionate affects may be very strong for as long as their cause is present, but rational affects—in particular, the desire for knowledge and the love of God—have innumerably more and greater causes, and so rational affects will “flourish more often, and engage the mind more” than passionate ones (ibid).

e. Via the Re-Ordering of the Affects

The final remedy Spinoza offers is unlike the previous two in that it is an activity that we can intentionally perform to diminish the force of our passions. It is based upon the power that he believes the human mind has to intentionally join two ideas to one another by frequently thinking about them in unison, so that when the first idea occurs, the second idea is naturally aroused in the mind as well. One of the ways in which we may apply this power is by intentionally joining passionate affects together with mottos or rules, “sure maxims of life,” that are rational to follow whenever those passions take hold of us (E5p10s, G II/287).

Spinoza uses several examples to flesh out how this remedy is supposed to work; the main example he uses is the maxim “that hate is to be conquered by love, or nobility, not by repaying it with hate in return” (ibid). He writes,

[W]e ought to think about and meditate frequently on the common wrongs of men, and how they may be warded off best by nobility. For if we join the image of a wrong to the imagination of this maxim, it will always be ready for us…when wrong is done to us. (E5p10s, G II/288)

We originally determine that nobility is a virtue by means of rational inquiry. However, we are not best served by attempting to recreate the chain of reasoning that would lead us to act nobly when someone insults or harms us, but rather by having that maxim firmly committed to memory. Spinoza is admitting that in the heat of the moment, we are unlikely to be able to simply reason our way out of passion. But by means of carefully arranging the thoughts our passions are associated with in advance, we can ensure that “the wrong, or hate usually arising from [another’s wronging us], will occupy a very small part of the imagination, and will be easily overcome” (ibid).

In this way, a person may intentionally use irrational processes (memory and imagination) to safeguard his ability to act rationally: “he who will observe these [rules] carefully…and practice them, will soon be able to direct most of his actions according to the command of reason” (E5p10s, G II/289). By training ourselves to react in ways that, in our calmer, dispassionate moments, we recognize to be rational, we will be prepared to respond appropriately even when we lack time for reflection. This appears to connect to Spinoza’s claim in the preface to Ethics 4 that we ought to cultivate and hold before ourselves an idealized human being whom we can model our own behavior upon (discussed in section 2.a above). Based on passages such as this, scholarship on Spinoza’s ethical theory has tended to depart from the traditional picture of imagination as something to be transcended through the use of reason; see, for example, Soyarslan (2014, 243–7), Steinberg (2014, 187–192), and James (2014, 154–159). Although Spinoza may rightly be called a rationalist in a number of senses, his account of how we achieve “freedom of mind, or blessedness” (E5pref) appears to depend as much on non-rational powers of imagination and memory as it does on reason.

4. Conclusion

In Spinoza’s view, human moral judgments are grounded in human desires or beliefs. However, in spite of this anti-realist metaethics, Spinoza endorses an intellectualist version of ethical egoism: reason dictates that we seek our greatest good, and this greatest good is understanding. He further tempers his ethical egoism by endorsing a version of contractarianism, according to which we may be bound to obey laws even when we recognize them to be irrational and they seem to hinder our efforts to seek our greatest good, since the alternative (living without the help of civil society) will always be far worse. Finally, to aid us in the pursuit of understanding, which is often hindered by our passions, Spinoza provides a series of “remedies” by which the force of the passions may be mitigated.

Thus, in spite of the fact that Spinoza initially appears to have no interest in our contemporary notion of moral philosophy, the moral theory he develops has a surprising degree of depth and nuance. Indeed, since he builds his account of morality on top of a thoroughly naturalistic conception of the world, and of humanity’s place in it—and since our desire not to be mastered by our passions remains as strong today as it was in the seventeenth century—Spinoza’s moral philosophy remains alive for us today.

5. References and Further Reading

a. Primary Sources

Passages from Spinoza’s Ethics are cited in the usual way. For example, ‘E1p25’ refers to Ethics part 1 proposition 25; ‘E1p25d’ refers to the demonstration of that proposition; ‘E1p25s’ to its scholium; and ‘E1p25c’ to its corollary. Reference to the Gebhardt edition page numbers is provided where the usual citation would refer to a span of more than one page.

  • Descartes, René. The Philosophical Writings of Descartes [vols. I–II], eds. J. Cottingham, R. Stoothoff, and D. Murdoch. (Cambridge: Cambridge University Press, 1985). [CSM]
  • Hobbes, Thomas. Leviathan, with selected variants from the Latin edition of 1668, ed. E. Curley. (Indianapolis: Hackett, 1994). [L]
  • Spinoza, Benedict de. Opera, ed. C. Gebhardt. (Heidelberg: Carl Winters Universitätsverlag, 1925). [G]
  • Spinoza, Benedict de. The Collected Works of Spinoza, ed. and trans. E. Curley. (Princeton: Princeton University Press, 1988). [E]
  • Spinoza, Benedict de. The Letters, ed. and trans. S. Shirley. (Indianapolis: Hackett Publishing, 1995). [Ep.]
  • Spinoza, Benedict de. Theological-Political Treatise, ed. and trans. S. Shirley. (Indianapolis: Hackett, 1998). [TTP]

b. Secondary Sources

  • Bennett, Jonathan. A Study of Spinoza’s Ethics. (Indianapolis: Hackett, 1984).
  • Curley, Edwin. Behind the Geometrical Method. (Princeton: Princeton University Press, 1988).
  • Curley, Edwin. “Spinoza’s Moral Philosophy.” In Spinoza: A Collection of Critical Essays, ed. M. Grene (Notre Dame: University of Notre Dame Press, 1979), 354–376.
  • Della Rocca, Michael. “Egoism and the Imitation of the Affects in Spinoza.” In Spinoza on Reason and the ‘Free Man’: The Jerusalem Conference [vol. 4], eds. Y. Yovel and G. Segal. (New York: Little Room Press, 2004), 123–147.
  • Garrett, Don. “Spinoza’s ethical theory.” In Cambridge Companion to Spinoza, ed. D. Garrett. (Cambridge: Cambridge University Press, 1996), 267–314.
  • Garrett, Don. “‘A Free Man Always Acts Honestly, Not Deceptively’: Freedom and the Good in Spinoza’s Ethics.” In Spinoza: Issues and Directions, eds. E. Curley and P. F. Moreau (Leiden: Brill, 1990), 221–38.
  • Grey, John. “Spinoza on Composition, Causation, and the Mind’s Eternity.” British Journal for the History of Philosophy 22(3), 2014: 446–467.
  • Grey, John. “‘Use Them At Our Pleasure’: Spinoza on Animal Ethics.” History of Philosophy Quarterly 30(4), 2013: 367–388.
  • Hübner, Karolina. “Spinoza on Being Human and Human Perfection.” In Essays on Spinoza’s Ethical Theory, eds. M. Kisner and A. Youpa. (Oxford: OUP, 2014), 124–142.
  • James, Susan. “Spinoza, the Body, and the Good Life.” In Essays on Spinoza’s Ethical Theory, eds. M. Kisner and A. Youpa. (Oxford: OUP, 2014), 143–159.
  • Kisner, Matthew. Spinoza on Human Freedom: Reason, Autonomy, and the Good Life. (Cambridge: Cambridge University Press, 2011).
  • Kisner, Matthew. “Spinoza’s Benevolence: The Rational Basis of Acting for the Benefit of Others.” Journal of the History of Philosophy 47(4), 2009: 549–567.
  • Kisner, Matthew and Andrew Youpa (eds.). Essays on Spinoza’s Ethical Theory. (Oxford: OUP, 2014).
  • Kober, Gal. “For They Do Not Agree In Nature: Spinoza and Deep Ecology.” Ethics and the Environment 18(1), 2013: 43–65.
  • Lebuffe, Michael. From Bondage to Freedom: Spinoza on Human Excellence. (Oxford: OUP, 2010).
  • Lebuffe, Michael. “Spinoza’s Normative Ethics.” Canadian Journal of Philosophy 37(3), 2007: 371–392.
  • Lin, Martin. “The Power of Reason in Spinoza.” In Cambridge Companion to Spinoza’s Ethics, ed. O. Koistinen. (Cambridge: Cambridge University Press, 2009), 258–283.
  • Marshall, Colin. “Spinoza on Destroying Passions with Reason.” Philosophy and Phenomenological Research 85(1), 2012: 139–160.
  • Melamed, Yitzhak. “Spinoza’s Anti-Humanism: An Outline.” In The Rationalists: Between Tradition and Innovation, eds. C. Fraenkel, D. Perinetti, and J. E. H. Smith. (Boston: Kluwer, 2011), 147–166.
  • Nadler, Steven. Spinoza’s Ethics: An Introduction. (Cambridge: Cambridge University Press, 2006).
  • Naess, Arne. “Spinoza and Ecology.” Philosophia 7, 1977: 45–54.
  • Rutherford, Donald. “Spinoza and the Dictates of Reason.” Inquiry 51(5), 2008: 485–511.
  • Soyarslan, Sanem. “From Ordinary Life to Blessedness.” In Essays on Spinoza’s Ethical Theory, eds. M. Kisner and A. Youpa. (Oxford: OUP, 2014), 236–257.
  • Steinberg, Justin. “Following a Recta Ratio Vivendi: The Practical Utility of Spinoza’s Dictates of Reason.” In Essays on Spinoza’s Ethical Theory, eds. M. Kisner and A. Youpa. (Oxford: OUP, 2014), 178–196.
  • Verbeek, Theo. Spinoza’s Theologico-Political Treatise: Exploring ‘The Will of God’. (Burlington, VT: Ashgate, 2003).
  • Wilson, Catherine. “The Strange Hybridity of Spinoza’s Ethics.” In Early Modern Philosophy, eds. C. Mercer and E. O’Neill. (Oxford: OUP, 2005), 86–99.
  • Youpa, Andrew. “Spinoza’s Theories of Value.” British Journal for the History of Philosophy 18(2), 2010: 209–229.
  • Youpa, Andrew. “Spinoza’s Theory of the Good.” In Cambridge Companion to Spinoza’s Ethics, ed. O. Koistinen. (Cambridge: Cambridge University Press, 2009), 242–57.


Author Information

John Grey
Email: jrtgrey@gmail.com
Michigan State University
U. S. A.

The Upaniṣads

The Upaniṣads are ancient texts from India that were composed orally in Sanskrit between about 700 B.C.E. and 300 B.C.E. There are thirteen major Upaniṣads, many of which were likely composed by multiple authors and are comprised of a variety of styles. As part of a larger group of texts, known as the Vedas, the Upaniṣads were composed in a ritual context, yet they mark the beginning of a reasoned enquiry into a number of perennial philosophical questions concerning the nature of being, the nature of the self, the foundation of life, what happens to the self at the time of death, the good life, and ways of interacting with others. As such, the Upaniṣads are often considered to be the fountainhead of the subsequent rich and varied philosophical tradition in India. The Upaniṣads contain some of the oldest discussions about key philosophical terms such as ātman (the self), brahman (ultimate reality), karma, and yoga, as well as saṃsāra (worldly existence), mokṣa (enlightenment), puruṣa (person), and prakṛti (nature)—all of which would continue to be central to the philosophical vocabulary of later traditions. In addition to contributing to the development of a discursive language, the Upaniṣads further frame later philosophical debates by their exploration of a number of means of attaining knowledge, including deduction, comparison, introspection, and debate.

Table of Contents

  1. The Upaniṣads and the Vedas
    1. Main Upaniṣads
    2. Minor Upaniṣads
  2. From Ritual to Philosophy
  3. The Self
  4. Ātman and Brahman
  5. Karma, Saṃsāra, and Mokṣa
  6. Ethics and the Upaniṣads
  7. The Upaniṣads and Hindu Darśanas before Vedānta
  8. The Upaniṣads and Vedānta
  9. The Upaniṣads as Philosophy
  10. The Upaniṣads in the Modern Period
  11. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. The Upaniṣads and the Vedas

a. Main Upaniṣads

The Upaniṣads are the fourth and final section of a larger group of texts called the Vedas. There are four different collections of Vedic texts, the Ṛgveda, Yajurveda, Sāmaveda, and Atharvaveda, with each of these collections containing four different layers of textual material: the Saṃhitās, Brāhmaṇas, Āraṇyakas, and Upaniṣads. Although each of these textual layers has a variety of orientations, the Saṃhitās consist largely of hymns praising gods and the Brāhmaṇas are mostly concerned with describing and explaining Vedic rituals. The Āraṇyakas and Upaniṣads are also firmly rooted in ritual, but with both groups of texts there is an increasing emphasis on understanding the meaning of ritual, while some sections of the Upaniṣads seem to move completely away from the ritual setting into naturalistic and philosophical inquiry about the processes of life and death, the workings of the body, and the nature of reality.

The Vedic Upaniṣads are widely recognized as being composed during two chronological stages. The texts of the first period, which would include the Bṛhadāraṇyaka (BU), Chāndogya (CU), Taittirīya (TU), Aitareya (AU), and Kauṣītakī (KsU), are generally dated between 700 and 500 B.C.E., and are considered to predate the emergence of the so-called heterodox traditions, such as the Buddhists, Jains, and Ājīvikas. Scholarly consensus dates the second stage of Vedic Upaniṣads, which includes the Kena (KeU), Kaṭha (KaU), Īśā (IU), Śvetāśvatara (SU), Praśna (PU), Muṇḍaka (MuU), Māṇḍūkya (MaU), and Maitrī (MtU), between 300 and 100 B.C.E. (Olivelle 1998: 12–13). The older Upaniṣads are primarily composed in prose, while the later ones tend to be in metrical form, but any individual text may contain a diversity of compositional styles. Additionally, many individual Upaniṣads consist of various types of material, including creation myths, interpretations of ritual actions, lineages of teachers and students, magical formulae, procreation rites, and narratives and dialogues about famous teachers, students, and kings.

The so-called Hindu darśanas—Nyāya, Vaiśeṣika, Mīmāṃsā, and Vedānta—do not adhere to the chronology above, as they regard all the Vedic Upaniṣads as śruti, meaning a timeless revealed knowledge. The remaining two Hindu darśanas—Sāṃkhya and Yoga—are usually read as supporting the Vedas.  However, when tracing the historical development of philosophical ideas, it is helpful to note some differences in orientation between the two stages of Upanishadic material. While all the Upaniṣads devote considerable attention to topics such as the self (ātman) and ultimate reality (brahman), as well as assume some version of the karma doctrine, the earlier texts tend to characterize ultimate reality in abstract and impersonal ways, while the later Upaniṣads, particularly the Īśā and Śvetāśvatara, are more theistic in orientation. Meanwhile, the later Upaniṣads explicitly address a number of key topics such as yoga, mokṣa, and saṃsāra, all of which would continue to be central aspects of subsequent Indian philosophy.

b. Minor Upaniṣads

In addition to those affiliated with the Vedas, there are literally hundreds of other texts bearing the name “Upaniṣad.” These texts have been grouped together by scholars according to common themes, such as the Yoga Upaniṣads (Upaniṣads on Yoga), the Saṃnyāsa Upaniṣads (Upaniṣads on Renunciation), the Śaiva Upaniṣads (Upaniṣads on the Hindu God Śiva), and the Vaiṣṇava Upaniṣads (Upaniṣads on the Hindu God Viṣṇu) (see Deussen 1980 and Olivelle 1992). The majority of these texts were composed between the 2nd and 15th centuries C.E., although texts referred to as “upaniṣad” have continued to be composed up to the present day. Many of the post-Vedic Upaniṣads further develop core concepts from the Vedic Upaniṣads, such as ātman, brahman, karma, and mokṣa. In addition to a shared conceptual world, the post-Vedic Upaniṣads often quote extensively from the earlier texts and feature many of the same teachers and students, such as Yājñavalkya, Janaka, and Śaunaka.

2. From Ritual to Philosophy

Despite their significant contribution to subsequent Indian philosophical traditions, there has been disagreement about whether or not the Upaniṣads themselves constitute philosophy. Much of this debate depends, of course, on how one defines philosophy. A recurring argument for why the Upaniṣads might not be considered philosophy is that they do not contain a unified or systematic position. This, however, largely reflects the composite and fragmented nature of the texts. Rather than indicating a lack of system, the diversity of teachings is better understood in light of the fact that different texts were composed within the context of separate and often competing scholarly traditions or schools (śākhas). Accordingly, the Upaniṣads do not have a unified philosophical system, but rather contain a number of overlapping themes and mutual interests. Nonetheless, there can be considerable uniformity within a particular text or within a group of texts ascribed to the same school, and even more so among the lessons ascribed to any particular teacher. In addition to the distinct philosophical agendas of different texts, we see different teachers articulate their teachings within the context of competition over recruiting students, securing patronage, and debating with rivals in public contests. With this context in mind, it is not surprising to find various, sometimes conflicting, teachings throughout the texts.

Due to their connection with previous Vedic material, the Upaniṣads generally assume a ritual context, containing many passages that explain the significance of ritual actions or interpret mantras (sacred verses) uttered during the ritual. One of the most prevalent tendencies to continue from the ritual texts is an attempt to identify the underlying connections (bandhus) that exist among different orders of reality. Often these connections were made among three spheres: the cosmos, the body of the sponsor of the ritual (yajamāna), and the ritual grounds—in other words, between the macrocosm, the microcosm, and the ritual. An illustrative example appears at the beginning of the Bṛhadāraṇyaka Upaniṣad, where the different body parts of the horse in the sacrifice (aśvamedha) are compared to the different elements, regions, and intervals of time in the cosmos (BU 1.1). The implication is that by reflecting on the relational composition of the horse, one can understand the structure of the universe.

There have been some debates regarding the meaning of the word “upaniṣad,” with the components of the word (upa + ni + sad) suggesting texts that were to be learned ‘sitting down near’ one’s teacher. However, the word is not employed in this way in the texts, nor in existing commentaries. Rather, in its earliest textual contexts, the word “upaniṣad” takes on a meaning similar to bandhu, describing a connection between things, often presented in a hierarchical relationship. In these contexts, upaniṣad is often interpreted as the most essential or most fundamental connection. Moreover, “upaniṣad” designates equivalences between components of different realms of reality that were not considered to be observable by the senses, but remained concealed and obscured, and required special knowledge or understanding. On several occasions, “upaniṣad” means ‘secret teaching’ (for example, CU 1.1.10; 1.13.4; 8.8.4; 4.2.1; 5.5.3-4), a notion that is reinforced by the use of other formulations such as guhyā ādeśā (‘hidden instruction’; BU 3.5.2) and para guhya (‘supreme secret’; KaU 3.17; SU 6.22). In the Bṛhadāraṇyaka Upaniṣad, the word “upaniṣad” is equated with the formulation satyasya satyam (BU 2.2.20)—‘the truth behind the truth’—an expression suggesting that an upaniṣad is a truth or reality beyond that which appears to be true.

Whether discussing the essence of life or the source of a king’s power, the Upaniṣads show an interest in establishing a firm foundation or an ontological grounding for different aspects of reality, and ultimately, for reality as a whole. One of the terms most associated with these discussions is brahman. The oldest usages of the word are closely connected with the power of speech, with brahman meaning a truthful utterance or powerful statement. In the Upaniṣads, brahman retains this connection with speech, but also comes to refer to the underlying reality or the ontological foundation. In some passages brahman is associated with truth (TU 1.1), while on other occasions it is linked with immortality (CU 2.23.1) or characterized as a heavenly abode (BU 4.4.7-8).

3. The Self

One of the most widely discussed topics throughout both the early and late Upaniṣads is the self (ātman). The word “ātman” is a reflexive pronoun, likely derived from √an (to breathe). Even in the Ṛgveda (c.1200 B.C.E.), the earliest textual source from ancient India, ātman already had a wide range of lexical meanings, including ‘breath’, ‘spirit’, and ‘body’. By the time of the Upaniṣads, the word was used in a variety of ways, sometimes referring to the material body, but often designating something like an essence, a life-force, consciousness, or ultimate reality.

One of the most well known teachings of ātman appears in the Chāndogya Upaniṣad (6.1-16), as the instruction of the brahmin Uddālaka Āruṇi to his son Śvetaketu. Uddālaka begins by explaining that one can know the universal of a material substance from a particular object made of that substance: by means of something made of clay, one can know clay; by means of an ornament made of copper, one can know copper; by means of a nail cutter made of iron, one can know iron. Uddālaka uses these examples to explain that objects are not created from nothing, but rather that creation is a process of transformation from an original being (sat) which emerges into the multiplicity of forms that characterizes our everyday experiences. Uddālaka’s explanation of creation is often assumed to have influenced the satkāryavāda theory—the theory that the effect exists within the cause—which was accepted by the Sāṃkhya, Yoga, and Vedānta darśanas.

Later in his instruction to Śvetaketu, Uddālaka makes a series of inferences from comparisons with empirically observable natural phenomena to explain that the self is a non-material essence present in all living beings. He first uses the example of nectar, which bees collect from different sources but which becomes an undifferentiated whole when gathered together. Similarly, water flowing from different rivers merges together without distinction when reaching the ocean. Uddālaka then asks Śvetaketu to conduct two simple experiments. In the first he instructs his son to cut a banyan fruit, and then the seed within the fruit, only for his son to find that he cannot observe anything inside the seed. Uddālaka compares the fine essence of the seed, which cannot even be seen, to the self. Uddālaka then tells Śvetaketu to place some salt in water. When returning the next day, Śvetaketu cannot see the salt anywhere in the water, but by tasting the water he perceives that it is equally distributed throughout. Uddālaka concludes that, like salt in water, the self is not immediately discernible, yet permeates the entire body. After each of these comparisons with natural phenomena, Uddālaka brings attention back to Śvetaketu, emphasizing that the self operates the same way in him as it does in all living beings. Repeating the phrase ‘you are that’ (tat tvam asi) throughout his discourse, Uddālaka teaches that the self is both the essence that connects parts with the whole and the constant that remains the same even while taking on different forms. Thus, he offers an organic understanding of ātman, characterizing the self in terms of the life force that animates all living beings.

Yājñavalkya, the most prominent teacher in the Bṛhadāraṇyaka Upaniṣad, characterizes ātman more in terms of consciousness than as a life-giving essence. In a debate that pits him against Uddālaka—his senior colleague and, by some accounts, his former teacher (BU 6.3.7; 6.5.3)—Yājñavalkya explains that the self is the inner controller (antaryāmin), present within all sensing and cognizing, yet at the same time distinct (BU 3.7.23). Here, Yājñavalkya characterizes the self as that which has mastery over the otherwise distinct psycho-physical capacities. He goes on to explain that we know the existence of the self through actions of the self, through what the self does, not through our senses—that the self, as consciousness, cannot be an object of consciousness.

Another recurring theme in Yājñavalkya’s discussion with Janaka is that the self is described as consisting of various parts, but not reducible to any of them (for example, BU 4.4.5; see also TU 2.2.1). Similarly, in a creation myth at the beginning of the Aitareya Upaniṣad, ātman is cast as a creator god, who creates the various elements and bodily functions from himself (AU 1.3.11). As with Yājñavalkya’s teaching, in this passage the functions of the body and cognitive capacities are seen to be components of the self and even evidence of the self, but the self cannot be reduced to any particular part. Such examples emphasize that an understanding of the self cannot be attained by observing how the self operates in just one faculty, but only by observing the self in relation to a number of psycho-physical faculties and their relationships with each other. In addition to being portrayed as the agent or inner controller (antaryāmin) of sensing and cognizing, the self is characterized as an underlying base or foundation (pratiṣṭha) of all the sense and cognitive faculties. Throughout his teachings Yājñavalkya describes the self as being hidden or behind that which is immediately perceptible, suggesting that the self cannot be known by rational thought or described in conventional language because it can never be the object of thought or knowledge. Here, Yājñavalkya draws attention to the limitations of language, suggesting that because the self cannot be an object of knowledge it cannot have attributes, and therefore can only be described by using negative propositions.

Another prominent teacher of the self is Prajāpati, the creator god of Vedic ritual texts, who is recast in the Chāndogya Upaniṣad as a typically aloof guru reluctant to disseminate his teachings (CU 8.7-12). Similar to Yājñavalkya, Prajāpati conceptualizes the self in terms of consciousness, describing ātman as the agent responsible for sensing and cognizing: ātman is ‘the one who is aware’ (CU 8.12.4-5). However, despite some similarities with Yājñavalkya’s teaching of ātman, Prajāpati seems to reject some of his positions. Prajāpati’s teaching is presented in the context of his instruction to the god Indra, taking place during several episodes over a period of more than one hundred years. In his first teaching Prajāpati defines the self as the material body, and sends Indra away thinking he has learned the true teaching. Before going back to the other gods, however, Indra realizes that this teaching cannot be true, and returns to Prajāpati to learn more. This pattern continues several times, before Prajāpati finally delivers his true teaching: that ātman is the ‘one who is aware’. One of the teachings that Prajāpati presents as false, or at least as incomplete, is a description of ātman in terms of dreamless sleep, a teaching of the self that Yājñavalkya describes as the ‘highest goal’ and ‘the highest bliss’ in his instruction to King Janaka in the Bṛhadāraṇyaka Upaniṣad (4.3.32).

Despite the diversity among these teachings, most of the discussions represent a different set of concerns than those found in earlier Vedic texts, with many teachings focusing on the human body and individual person as opposed to the primordial or ideal body, as often discussed in Vedic rituals. Rather than assuming a correspondence between the human body and the universe, some of the teachings about the self in the Upaniṣads begin to show an interest in the fundamental essence of life.

4. Ātman and Brahman

Perhaps the most famous teaching of the self, the identification of ātman and brahman, is delivered by Śāṇḍilya in the Chāndogya Upaniṣad. After describing ātman in various ways, Śāṇḍilya equates ātman with brahman (CU 3.14.4), implying that if one understands brahman as the entire world, and one understands that the self is brahman, then one becomes the entire world at the time of death.

Although Śāṇḍilya’s teaching of ātman and brahman is often considered the central doctrine of the Upaniṣads, it is important to remember that this is not the only characterization either of the self or of ultimate reality. While some teachers, such as Yājñavalkya, also equate ātman with brahman (BU 4.4.5), others, such as Uddālaka Āruṇi, do not make this identification. Indeed, Uddālaka, whose famous phrase tat tvam asi is later taken by Śaṅkara to be a statement of the identity of ātman and brahman, never uses the term “brahman”—neither in his instruction to his son Śvetaketu, nor in any of his many other appearances in the Upaniṣads. Moreover, it is often unclear, even in Śāṇḍilya’s teaching, whether linking ātman with brahman refers to the complete identity of the self and ultimate reality, or if ātman is considered an aspect or quality of brahman. Such debates about how to interpret the teachings of the Upaniṣads have continued throughout the Indian philosophical tradition, and are particularly characteristic of the Vedānta darśana.

Furthermore, while most teachings about brahman assume that the world emerged from one undifferentiated abstract cosmic principle, there are a number of passages explaining creation in terms of a more materialist point of view, describing the world as coming forth from an initial natural element, such as water or air. The Bṛhadāraṇyaka Upaniṣad (5.1), for example, contains a teaching attributed to the son of Kauravyāyanī, depicting brahman as space. This same section of the Bṛhadāraṇyaka Upaniṣad (5.5.1) includes a passage describing the world as beginning from water. Similarly, in the Chāndogya Upaniṣad (4.3.1-2), Raikva traces the beginning of the world to wind in the cosmic sphere, and breath in the microcosm.

Returning to the self, and keeping in mind later philosophical developments, it is also worth noting that the Upaniṣads often present ātman in ways that contrast with the changeless and inactive descriptions of the self as articulated by traditions such as Sāṃkhya, Yoga, and Advaita Vedānta. As we have seen, the self can be characterized as both active and dynamic: as the inner controller (antaryāmin), the self is depicted as the agent or actor behind all sensing and cognizing faculties (for example, BU 3.7.23); while as a creator god, ātman is cast as a personal deity—closely resembling Prajāpati—from whom all creation emanates (BU 1.4.1; 1.4.17; TU 2.1; AU 1.1).

One feature of the self that is quite consistent throughout the Upaniṣads and continues to be shared by a number of subsequent schools of Hindu philosophy is that knowledge of ātman can lead to some sort of liberation or ultimate freedom. While the Sāṃkhya and Yoga schools would conceptualize such emancipation as kaivalya—abstraction, autonomy from nature—and Advaita Vedāntins as freedom from ignorance (avidyā), in the Upaniṣads the ultimate goal achieved through knowledge of the self is primarily freedom from death. Nonetheless, a prominent philosophical strand in the Upaniṣads, particularly in the teachings of Yājñavalkya, is that ātman dwells within the body when it is alive, that ātman, in one way or another, is responsible for the body being alive, and that ātman does not die when the body dies, but rather finds a dwelling place in another body. Such depictions seem to have been a catalyst for or been developed alongside early Buddhist conceptions of selfhood. The Buddhists explicitly rejected any notion of an indivisible and unchanging self, not only introducing the term “not-self” (anātman in Sanskrit; anattā in Pāli) to describe the lack of any fixed essence, but also explaining karmic continuity from one lifetime to the next in terms of the five skandhas—a theory maintaining that what Upanishadic thinkers take to be a unified self is really made of five components, all of which are subject to change.

5. Karma, Saṃsāra, and Mokṣa

Karma (“karman”) is another central concept in the Indian philosophical tradition that finds some of its first philosophical articulations in the Upaniṣads. Literally meaning ‘action’, karma emerges out of a ritual context where it refers to any ritual action, which, if performed correctly, yields beneficial results, but if performed incorrectly, brings about negative consequences. The Upaniṣads do not offer any explicit theory of karma, but do contain a number of teachings that seem to extend the notion of karma beyond the ritual context to more general understandings of moral retribution and of causality. Yājñavalkya, for example, when asked by Ārtabhāga about what happens to a person after death, responds that a person becomes good by good action and bad by bad action (BU 3.2.13). Here and elsewhere, one of Yājñavalkya’s fundamental assumptions is that present actions have consequences in the future and that our present circumstances have been shaped by our past actions. While this law-like character of karma suggests that the consequences of one’s actions shape one’s future, Yājñavalkya does not give any indication that the future is completely determined. Rather, he seems to suggest that one can create good consequences in the future by performing good actions in the present. In other words, Yājñavalkya presents karma more as a theory to promote good actions than as a fatalistic doctrine in which the future is fixed.

While Yājñavalkya assumes that karma takes place across lifetimes, he does not attempt to explain the mechanisms of rebirth. In the Chāndogya Upaniṣad, however, King Pravāhaṇa Jaivali is more specific about how karma and rebirth operate, describing the link between them in terms of a naturalistic philosophy (CU 5.4-10). In a dialogue that also appears in the Bṛhadāraṇyaka Upaniṣad (BU 6.2.9-16), but without the explicit connection to karma, Pravāhaṇa discloses the teaching of the five fires (pañcāgnividyā) to Uddālaka Āruṇi. Pravāhaṇa’s instruction describes human life as part of a cycle of regeneration, whereby the essence of life takes on different forms as it passes through different levels of existence: when humans die, they are cremated and travel in the form of smoke to the other world (the first fire), where they become soma; as soma they enter a rain cloud (the second fire) and become rain; as rain they return to earth (the third fire), where they become food; as food they enter man (the fourth fire), where they become semen; as semen they enter a woman (the fifth fire) and become an embryo. According to Pravāhaṇa, those who know the teaching of the five fires follow the path of the gods and enter the world of brahman, but those who do not know this teaching will follow the path of the ancestors and continue to be reborn.

Pravāhaṇa states that knowledge of the teaching of the five fires will affect the conditions of one’s future births. He explains that people who are pleasant will enter a ‘pleasant womb’ such as the womb of a brahmin, a kṣatriya, or a vaiśya, while people of foul behavior can expect to enter the womb of a dog, a pig, or an outcaste (CU 5.10.7). In this teaching, Pravāhaṇa demonstrates the link between karma and rebirth by specifying different types of animals (dogs, pigs) and different types of matter (smoke, rain, food, semen) through which karma operates. By implication, karma not only applies to the causes and effects of human actions, but also includes non-human animals and other forms of organic and inorganic matter. Moreover, karma is not directed by a divine being, but rather is described as an independent, natural process. As such, karma is presented as an impersonal moral force that operates throughout the totality of existence, balancing out the consequences of good and bad action. Here, we see that Pravāhaṇa’s teaching implies that everyone’s actions have moral consequences and that all the actions of humans and non-humans are interconnected.

Such discussions linking actions in one lifetime to consequences in a future one would become widely accepted in subsequent philosophical discourse—not only among Hindus, but also by Buddhists and Jains, and, to a certain extent, by the Ājīvikas. In subsequent developments across these traditions, karma would often be conceptualized in terms of intention and much of what we might describe as ethics was to be focused on ways to cultivate a state of mind that would generate positive rather than negative intentions.

Despite the development of ideas about karma, the earliest Upaniṣads generally do not contain the assumption that life is suffering (duḥkha), or illusion (māyā), or ignorance (avidyā)—views that would later dominate discussions of karma and rebirth. Nonetheless, we do see the introduction of the term “saṃsāra” in the comparatively late Kaṭha (3.7) and Śvetāśvatara (6.16) Upaniṣads. Literally meaning ‘that which turns around forever’, saṃsāra refers to the cycle of birth, life, death, and rebirth. All living creatures, including the gods, are considered to be a part of saṃsāra. Accordingly, death is not considered to be final, and rebirth is an essential aspect of existence.

Closely related to saṃsāra is mokṣa, the concept that one can escape or be released from the endless cycle of repeated births. Similar to saṃsāra, the Upaniṣads do not contain an explicit theory about mokṣa, with the term “mokṣa” only assuming its connotations of liberation in the later texts (for example, SU 6.16). The Hindu darśanas would subsequently consider mokṣa to be a fundamental teaching of all the Upaniṣads, but the texts themselves, particularly the early ones, focus much more attention on securing wealth, status, and power in this lifetime than on describing existence as an endless cycle. They also tend to present life as desirable, and not as a condition from which people need release or escape. One of the most common soteriological goals is immortality, amṛta, which literally means ‘not dying’. The Upaniṣads describe immortality in different ways, including having a long life span, surviving death in the heavenly world, becoming one with the essential being of the universe, and being preserved in the social memory.

6. Ethics and the Upaniṣads

Philosophy in the Upaniṣads does not merely consist of abstract claims about the nature of reality, but is also presented as a way of living one’s life. In Yājñavalkya’s teaching to Janaka, for example, knowledge of ātman is associated with a change in one’s disposition and behavior. As we have seen, karma is characterized as a natural moral process, with knowledge of the self as a way out of that process. In this respect, a fundamental assumption throughout many teachings of the self is that it is untouched by karma. Yājñavalkya teaches Janaka that knowledge of the self is beyond the virtuous (kalyāṇa) and the evil (pāpa)—that, through knowledge of the self, one reaches the world of brahman, where the good or bad actions of one’s life do not follow (BU 4.4.22).

Yājñavalkya explains that in the world of brahman a thief is not a thief, a murderer is not a murderer, an outcaste not an outcaste, a mixed-caste person (paulkasa) is not a mixed-caste person, a renunciate (śramaṇa) is not a renunciate, and an ascetic not an ascetic, that neither the good (puṇya) nor the evil (pāpa) follow him (BU 4.3.22; see also TU 2.9.1). In using these examples Yājñavalkya illustrates the degree to which knowledge of the self is beyond everyday notions of moral behavior. In other words, he seems to be saying that even if one has committed evil deeds, one can still be liberated from karma by means of knowing the self. Yet Yājñavalkya is not suggesting that one can continue to perform ‘evil’ deeds without suffering karmic retribution. Rather, as he asserts later in his discussion with Janaka: when one is knowledgeable, one necessarily acts morally. Yājñavalkya explains that a man who has proper knowledge becomes calm (śānta), restrained (dānta), withdrawn (uparata), patient (titikṣu), and composed (samāhita) (BU 4.4.23). Here Yājñavalkya characterizes knowledge of the self as a change in one’s disposition. In other words, one who is a knower of the self becomes a person of good character and—by definition—would not perform an evil action.

While Yājñavalkya talks about becoming calm, restrained, withdrawn, patient, and composed, these dispositions are not presented as virtues to cultivate for the sake of knowledge, but rather as consequences of knowing ātman. Subsequent texts would devote considerable attention to how one should cultivate oneself in order to achieve the highest knowledge. For example, both the eight-fold path of the Buddhist Nikāyas and the eight limbs in the Yoga Sūtra suggest that one needs to live a moral life in order to achieve true knowledge. In Yājñavalkya’s teaching about ātman, however, there is more attention paid to the objective of knowing the self than the ethical means of controlling the self.

Despite the lack of details about the path to knowledge, Yājñavalkya nevertheless connects knowledge of ātman with particular practices, explaining to Janaka that brahmins seek to know ātman by means of Vedic recitation (vedānuvacana), sacrifice (yajña), gift-giving (dāna), austerity (tapas), and fasting (BU 4.4.22). Yājñavalkya elaborates, claiming that by knowing the self, one becomes a sage (muni), undertaking an ascetic and peripatetic lifestyle (BU 4.4.22). Here, Yājñavalkya implies that those who come to know ātman will become renunciates—that knowledge of the self not only brings about certain dispositions or a certain character, but also provokes a particular lifestyle. Similarly, in the Muṇḍaka Upaniṣad, Aṅgiras teaches Śaunaka that the self can be mastered by means of asceticism and celibacy, among other practices (MuU 3.1.5).

With the connection between knowledge and lifestyle, there are notable gender implications of Upanishadic teachings. Yājñavalkya, for example, assumes that the main knowers of the self will be brahmin men, even claiming that through knowledge of the self one can become a brahmin (BU 4.4.23). The word “ātman” is grammatically masculine and teachings of the self are directed specifically towards a male audience and articulated in overtly androcentric metaphors (Black 2007: 135-41). Nonetheless, a number of teachings of the self suggest that true knowledge goes beyond gender distinctions. As we have seen, Uddālaka Āruṇi describes the self as an organic, universal life-force, while Yājñavalkya teaches that one who knows the self will see the self in all living beings (BU 4.4.23). It is also noteworthy that the Upaniṣads depict several women—such as Gārgī and Maitreyī—as participating in philosophical discussions and debates (Black 2007: 48-67; Lindquist 2008).

7. The Upaniṣads and Hindu Darśanas before Vedānta

The influence of the Upaniṣads on the so-called ‘Hindu’ darśanas is more oblique than explicit, with few direct references, yet with many of the dominant terms and concepts seemingly inherited from them. Many of the six main Hindu schools officially recognize the Upaniṣads as a source of philosophy in so far as they recognize śabda as a valid means for attaining knowledge. Śabda literally means ‘word’, but in philosophical discourse it refers to verbal testimony or reliable authority, and is sometimes taken to refer specifically to śruti. Despite the nominal acceptance of śabda as a pramāṇa, however, the Upaniṣads are only cited occasionally in the surviving texts, and rarely as a source to validate fundamental arguments, before the emergence of the Vedānta school in the 7th century.

Notably, the Upanishadic notion of self—as a spiritual essence separate from the physical body—is generally accepted by the classical Hindu philosophical schools. The Nyāya and Mīmāṃsā darśanas, for example, which do not cite the Upaniṣads to prove the existence of the self, nevertheless describe the self as an immaterial substance that resides in and acts through the body. In addition to conceptual similarities with certain passages from the Upaniṣads, both schools seem to consider the Upaniṣads as texts that specialize in the self. The Nyāya philosopher Vātsyāyana (c. 350-450 C.E.), for instance, characterizes the Upaniṣads as dealing with the self.

Similarly, the early texts of the Sāṃkhya and Yoga darśanas do not refer to the Upaniṣads when making their fundamental arguments, but do seem to inherit much of their terminology, as well as some of their views, from them. At the beginning of Uddālaka Āruṇi’s instruction to Śvetaketu in the Chāndogya Upaniṣad (6.2-5), for instance, he describes existence (sat) as consisting of three forms (rūpas): fire (red), water (white), and food (black)—a scheme that closely resembles the later Sāṃkhya doctrine of prakṛti and the three guṇas. The Śvetāśvatara Upaniṣad (4.5), the oldest extant text to use the word “sāṃkhya” (5.2), seems to build on Uddālaka’s three-fold scheme when describing the unborn as red, white, and black. Also, a number of core terms in Sāṃkhya philosophy first appear in the Upaniṣads, such as ahaṃkāra (CU 7.25.1) and the tattvas (BU 4.5.12), while some passages contain groups of terms appearing together in ways that are similar to how they appear in later Sāṃkhya texts: the Kaṭha Upaniṣad (3.10-11), for example, lists a hierarchy of principles including person (puruṣa), discernment (buddhi), mind (manas), and the sense capacities (indriyas).

A number of details about the practice of yoga, which would become more systematized by the Yoga darśana, are also first found in the Upaniṣads. The Kaṭha (3.3-13; 6.7-11) and Śvetāśvatara (2.8-11) Upaniṣads both contain some of the earliest descriptions of exercises for controlling the senses, breathing techniques, and bodily postures, with the Śvetāśvatara Upaniṣad (for example, 2.15-17) making explicit connections between yogic practice and union with a personal god—a connection that would be of central importance in the Yoga darśana. The Maitrī Upaniṣad (6-7) has the most extensive and systematic discussion of yoga in the Upaniṣads, containing a number of parallels with the Yoga Sūtra.

In addition to employing terms and concepts from the Upaniṣads, there are occasions when classical Indian philosophers refer to the Upaniṣads directly. Vātsyāyana, of the Nyāya school, quotes passages from the Bṛhadāraṇyaka and Chāndogya Upaniṣads when discussing mokṣa, the means of attaining it, and the stages of life. Additionally, the grammarian Patañjali (c.150 B.C.E.) argues that the study of grammar is useful for a correct understanding of passages from the Upaniṣads, and thus for attaining mokṣa.

Such examples indicate that the philosophers of classical Hindu philosophy knew the Upaniṣads quite well and would dip into the texts from time to time to provide an analogy or, occasionally, to support one of their arguments. However, the early surviving texts of the Nyāya, Vaiśeṣika, Mīmāṃsā, Sāṃkhya, and Yoga schools do not tend to use the Upaniṣads to validate their core positions. The Vaiśeṣika Sūtra (3.2.8), for example, agrees that the self is discussed in the Upaniṣads, but then argues that the existence of the self should not be established exclusively by means of śruti, but can also be determined through inference. Additionally, none of the early schools produced a commentary on the Upaniṣads, nor did any of them aim to offer an interpretation of the Upaniṣads as a whole. As such, the Upaniṣads provided a general philosophical framework, as well as serving as a repository for terms and analogies, but none of the early schools claimed the texts for themselves.

An interesting illustration of this point is that competing schools would sometimes recognize that their rival’s positions were also to be found in the Upaniṣads. The Nyāya philosopher Jayanta Bhaṭṭa even finds the positions of the heterodox Lokāyata darśana, or Materialist school, in the Upaniṣads. In the context of criticizing the validity of śabda as a pramāṇa, Bhaṭṭa argues that if śabda were a valid means for establishing knowledge, then even the doctrines of the Lokāyatas must be true, because their doctrines can be found in the Upaniṣads. Due to a lack of sources from the Lokāyata school, we do not know if they ever referred to the Upaniṣads in their own texts, but Bhaṭṭa’s argument is illustrative of a general reluctance of most of the early schools to put too much stake in śruti as a means of knowledge. His comments are also an acknowledgement that the Upaniṣads contain a variety of viewpoints.

8. The Upaniṣads and Vedānta

The oldest surviving systematic interpretation of the Upaniṣads is the Brahma Sūtra (200 B.C.E.-200 C.E.), attributed to Bādarāyaṇa. Although technically not a commentary (that is, it is a sūtra rather than a bhāṣya), the Brahma Sūtra is an explanation of the philosophy of the Upaniṣads, treating the texts as the source for knowledge about brahman. Despite being considered a Vedānta text, the Brahma Sūtra (also known as the Vedānta Sūtra) was composed centuries before the establishment of Vedānta as a philosophical school. The Brahma Sūtra uses the Upaniṣads to refute the position of dualism, as put forth by the Sāṃkhya school. As Śaṅkara would do later, the Brahma Sūtra (1.1.3-4) states that śruti is the source of all knowledge about brahman. Additionally, the Brahma Sūtra maintains that mokṣa is the ultimate goal, as opposed to action or sacrifice.

Centuries later, the Vedānta darśana was the first philosophical school to attempt to present the Upaniṣads as holding a unified philosophical position. Vedānta means ‘end of the Vedas’ and is often used to refer specifically to the Upaniṣads. The school divides the Vedas into two sections: karmakānda, the section of ritual action (consisting of the Saṃhitās and the Brāhmaṇas), and jñānakānda, the section of knowledge (consisting of the Upaniṣads, and to a certain extent, the Āraṇyakas). According to the Vedānta school, the ritual section contains detailed instructions on how to perform the rituals, whereas the Upaniṣads contain transcendent knowledge for the sake of achieving mokṣa. There are three main branches of the Vedānta school: Advaita Vedānta, Viśiṣtādvaita Vedānta, and Dvaita Vedānta. Although these branches would put forth distinct philosophical positions, they all took śabda as the exclusive means to knowledge about their central doctrines and considered the Upaniṣads, the Brahma Sūtra, and the Bhagavad Gītā as their core texts (prasthānatraya). Despite disagreeing with each other, all three of the most well-known philosophers of the Vedānta school—Śaṅkara, Rāmānuja, and Madhva—wrote commentaries on the Upaniṣads, presenting them as having a single and consistent philosophical position.

The most well-known philosopher of the Vedānta school was Śaṅkara (c. 700 C.E.), whose interpretations of the Upaniṣads made a major impact on the Indian philosophical tradition in the centuries after his lifetime and continued to dominate readings of the texts throughout the 19th and early 20th centuries. Śaṅkara was the main proponent of Advaita Vedānta, which put forth a position of non-dualism. According to Śaṅkara the fundamental teaching of the Upaniṣads is that ātman and brahman are one and the same.

For Śaṅkara, the Upaniṣads are not merely sources to back up his claims, but they also provide him with techniques for making his arguments. Śaṅkara takes the Upaniṣads as outlining methods for their own interpretation, following a number of literary criteria as clues for how to read the texts (Hirst 2005: 59-64). Consequently, even when he uses examples not found in the Upaniṣads, Śaṅkara can maintain that his arguments are based on scripture: as long as he argues in the same way that the Upaniṣads do, he can claim that his arguments are grounded in his sources.

Despite the significance of Śaṅkara’s philosophy, it is important to note that his interpretation of the Upaniṣads was not the only one accepted by philosophers of the Vedānta school. Rāmānuja (c. 1000 C.E.), the main proponent of a form of Vedānta known as Viśiṣtādvaita, or qualified non-dualism, used the Upaniṣads to argue that ātman is not identical with brahman, but an aspect of brahman. Rāmānuja also found in the Upaniṣads a source for bhakti, as he identified the Upanishadic brahman with God. Two centuries later, Madhva (c. 1200 C.E.) used the Upaniṣads as a source for a dualist branch of the school, known as Dvaita Vedānta. Madhva interpreted brahman as an infinite and independent God, with the self as finite and dependent. As such, ātman is dependent upon brahman, but they are not exactly the same.

It is well known that the Vedānta school became extremely influential in shaping subsequent philosophical debates, and we may conjecture that the tendency for various Vedānta philosophers to use the Upaniṣads in support of their own positions, as well as in their criticisms of rival schools, prompted other schools to engage with the Upaniṣads more closely. This is illustrated by the fact that schools such as Nyāya and Sāṃkhya, which previously seem to have relied very little on the Upaniṣads, began invoking them to counter the claims of Advaita Vedānta.

The Nyāya philosopher Bhāsarvajña (c. 850-950 C.E.), for example, quotes some verses from the Upaniṣads to support his position of a distinction between the ordinary and supreme sense of self when arguing against the Advaita position of non-dualism. Another Nyāya philosopher, Gaṅgeśa (c. 1300 C.E.), seems to be quoting from the Upaniṣads to back up the claim that karmic retribution is not binding for those who know the self—a position stated by Yājñavalkya (BU 4.4.23). Moreover, a number of Sāṃkhya and Yoga philosophers use the Upaniṣads in an attempt to make their schools more compatible with Vedānta. The Sāṃkhya philosopher Nāgeśa (c. 1700-1750), for example, draws from the Upaniṣads—as well as the other two source texts of the Vedānta school, the Bhagavad Gītā and Brahma Sūtra—to argue that the Vedānta and Sāṃkhya schools do not contradict each other. This trend can also be found in the Sāṃkhyasūtra (c. 1400-1500 C.E.), which argues that the identification of brahman and ātman was a qualitative identity, but not a numerical one—seemingly defending Sāṃkhya against Śaṅkara’s criticism that the Sāṃkhya doctrine of multiple selves contradicts the Upaniṣads. Interestingly, this argument suggests that Sāṃkhya philosophers not only felt the need to show that their positions did not contradict the Upaniṣads, but also that they basically accepted the Advaita Vedānta reading of the Upaniṣads.

9. The Upaniṣads as Philosophy

As noted above, many of the Upaniṣads are composite and fragmented, and therefore lack a coherent philosophical position. Moreover, the teachers portrayed in the Upaniṣads do not seem to make linear arguments that start with premises and build to larger conclusions, but rather tend to make points through analogies and metaphors, with many core ideas presented as truths or insights known to particular teachers, not as logical propositions that can be independently verified. Nonetheless, in a number of sections of the texts, there appear to be implicit philosophical methods in place. We have already noted that Yājñavalkya’s discussion of the self is based on reflective introspection (see also MuU 3.1.8-9). The early Upaniṣads do not contain passages explicitly articulating method, but with the development of yoga and meditation in the later texts, introspection begins to be formalized as a philosophical mode of enquiry. Also, many of Uddālaka Āruṇi’s descriptions of ātman are derived from his observations of the natural world.

In addition to providing a repository of terms, concepts, and, to a certain degree, philosophical methods, from which subsequent philosophical schools would draw, the Upaniṣads were also influential in the development of the practice of debates, which would become the defining social practice of Indian philosophy. Although the texts do not discuss debate reflectively, a number of the most important teachings are articulated within the context of discussions between teachers and students, and verbal disputes among rival brahmins. In some dialogues, there is a dialectical relationship between the arguments of competing interlocutors, indicating that the dialogical presentation of teachings was a way of formulating philosophical rhetoric (Black 2015). In this way, debate is another way by which the Upaniṣads extend ideas first articulated in the context of the Vedic ritual into a more philosophical discourse.

10. The Upaniṣads in the Modern Period

The Upaniṣads are some of the most well-known Indian sources outside of India. Their first known translation into a non-Indian language was initiated by the Mughal prince Dārā Shūkōh, son of the emperor Shah Jahan. This Persian translation, known as the Sirr-i Akbar (the Great Secret), consisted of fifty texts, including the Vedic upaniṣads, many of the yoga, renunciate, and devotional upaniṣads, as well as other texts, such as the Puruṣa Sūkta hymn of the Ṛgveda and some material from unidentified sources. Dārā Shūkōh considered the Upaniṣads to be the sources of Indian monotheism and he was convinced that the Koran itself referred to the Upaniṣads.

Henry Thomas Colebrooke’s translation of the Aitareya Upaniṣad in 1805 was the first rendering of an upaniṣad into English. Rammohan Roy subsequently translated the Kena, Īśā, Kaṭha, and Muṇḍaka Upaniṣads into English, while his Bengali translation of the Kena Upaniṣad in 1816 was the first rendering of an upaniṣad into a modern Indian language.

Roy used the introductions to his translations into both Bengali and English to promote the reformation of Hinduism, endorsing the values of reason and religious tolerance, while criticizing practices such as idolatry and caste hierarchy. Roy felt that contemporary religion in India was in decline and hoped that his translations could provide Hindus with direct access to what he considered to be the true doctrines of Hinduism. The Upaniṣads first reached Europe in the modern period through the French philologist Abraham Hyacinthe Anquetil-Duperron’s translation of the Sirr-i Akbar into Latin, which was published in 1804. It was Anquetil-Duperron’s text, known as the Oupnek’hat, which was read by the German philosopher Arthur Schopenhauer, the first major European thinker to engage explicitly with Indian sources. Schopenhauer considered the Upaniṣads, Plato, and Kant to be the three major influences on his work and is known to have kept a copy of Anquetil-Duperron’s translation on his bedside table, reflecting that the Upaniṣads were his consolation in life and would equally be his consolation in death.

11. References and Further Reading

a. Primary Sources

  • Buitenen, J. A. B. van, tr. 1962. The Maitrāyaṇīya Upaniṣad: A Critical Essay with Text, Translation and Commentary. The Hague: Mouton & Co.
  • Deussen, Paul, tr. 1980 (originally published in 1897). Sixty Upaniṣads of the Veda, translated by V. M. Bedekar and G. B. Palsule. Delhi: Motilal Banarsidass.
  • Eggeling, Julius, tr. 1994 (originally published in 1882-97). Śatapatha Brāhmaṇa, Vols. 12, 26, 41, 43 and 44 (5 parts of the Sacred Books of the East; Delhi: Motilal Banarsidass).
  • Hume, Robert, tr. 1975 (originally published in 1921). The Thirteen Principal Upanishads. London: Oxford University Press.
  • Keith, A. B., tr. 1995 (originally published in 1909). Aitareya Āraṇyaka. London: Oxford University Press.
  • Müller, F. Max, tr. 2000 (originally published in 1897). The Upanishads Parts 1-2. Delhi: Motilal Banarsidass.
  • Olivelle, Patrick, tr. 1992. Saṃnyāsa Upaniṣads. New York: Oxford University Press.
  • Olivelle, Patrick, tr. 1996. The Upaniṣads. Oxford: Oxford University Press.
  • Olivelle, Patrick, tr. 1998. The Early Upaniṣads: Annotated Text and Translation. New York: Oxford University Press.
  • Oertel, H., tr. 1897. ‘The Jaiminīya or Talavakāra Upaniṣad Brāhmaṇa’. Journal of the American Oriental Society 16: 79-260.
  • Radhakrishnan, Sarvepalli, tr. 1992 (originally published in 1953). The Principal Upaniṣads. New Jersey: Humanities Press.
  • Roebuck, Valerie, tr. 2004. Upaniṣads. Harmondsworth: Penguin.

b. Secondary Sources

  • Black, Brian. 2007. The Character of the Self in Ancient India: Priests, Kings, and Women in the early Upaniṣads. Albany: State University of New York Press.
  • Black, Brian. 2011. “Ambaṭṭha and Śvetaketu: Literary Connections between the Upaniṣads and Early Buddhist Narratives.” Journal of the American Academy of Religion. Vol. 79, No. 1: 136-161.
  • Black, Brian. 2011. “The Rhetoric of Secrecy in the Upaniṣads.” Essays in Honor of Patrick Olivelle, edited by Steven Lindquist. Florence: Florence University Press: 101-125.
  • Black, Brian. 2015. “Dialogue and Difference: Encountering the Other in Indian Religious and Philosophical Sources.” Dialogue in Early South Asian Religions: Hindu, Buddhist, and Jain Traditions, edited by Brian Black and Laurie Patton. Farnham, UK: Ashgate: 243-257.
  • Brereton, Joel. 1990. “The Upanishads.” Approaches to the Asian Classics, edited by Wm. T. de Bary and I. Bloom. New York: Columbia University: 115-135.
  • Cohen, Signe. 2008. Text and Authority in the Older Upaniṣads. Leiden: Brill.
  • Deussen, Paul. 2000 (originally published 1919). The Philosophy of the Upanishads. Delhi: Oriental Publishers.
  • Ganeri, Jonardon. 2007. The Concealed Art of the Soul: Theories of Self and Practices of Truth in Indian Ethics and Epistemology. Oxford: Oxford University Press.
  • Hirst, J. G. Suthren. 2005. Śaṃkara’s Advaita Vedānta: A Way of Teaching, London: RoutledgeCurzon.
  • Killingley, Dermot. 1997. “The Paths of the Dead and the Five Fires.” Indian Insights: Buddhism, Brahmanism and Bhakti: Papers from the Annual Spalding Symposium on Indian Religions, edited by Peter Connolly and Sue Hamilton. London: Luzac Oriental.
  • Lindquist, Steven. 2008. “Gender at Janaka’s Court: Women in the Bṛhadāraṇyaka Upaniṣad Reconsidered.” Journal of Indian Philosophy. Vol. 36, No. 3: 405-426.
  • Lindquist, Steven. 2011. “Literary Lives and a Literal Death: Yājñavalkya, Śākalya, and an Upaniṣadic Death Sentence.” Journal of the American Academy of Religion. Vol. 79, No. 1: 33-57.
  • Olivelle, Patrick. 1999. “Young Śvetaketu: A Literary Study of an Upaniṣadic Story.” Journal of the American Oriental Society. Vol. 119, No.1: 46-70.
  • Olivelle, Patrick. 2009. “Upaniṣads and Āraṇyakas.” Brill’s Encyclopaedia of Hinduism, edited by Knut Jacobsen. Leiden: Brill: 41-55.
  • Olivelle, Patrick. 2012. “Kings, Ascetics and Brahmins: the Socio-Political Context of Ancient Indian Religions.” Dynamics in the History of Religions between Asia and Europe, edited by Volkhard Krech and Marion Steinicke. Leiden: Brill: 117-136.
  • Patton, Laurie. 2004. “Veda and Upaniṣad.” The Hindu World, edited by Sushil Mittal and Gene Thursby. London: Routledge: 37-51.
  • Thapar, Romila. 1993. “Sacrifice, Surplus, and the Soul.” History of Religions. Vol. 33, No. 4: 305-324.
  • Witzel, Michael. 2003. “Vedas and Upaniṣads.” The Blackwell Companion to Hinduism, edited by Gavin Flood. Oxford: Blackwell Publishing: 68-98.


Author Information

Brian Black
Email: b.black@lancaster.ac.uk
Lancaster University
United Kingdom

Totalitarianism

Totalitarianism is best understood as any system of political ideas that is both thoroughly dictatorial and utopian. It is an ideal type of governing notion, and as such, it cannot be realised perfectly.

Faced with the brutal reality of paradigmatic cases like Stalin’s USSR and Nazi Germany, philosophers, political theorists and social scientists have felt not just intellectually motivated but morally compelled to explain the causes and implications of totalitarianism. This has been in part an attempt to explain the socio-political phenomenon in itself, as well as to develop an intellectual tool in the arsenal of democracy.

Diverse philosophical perspectives have been employed. They share the important common denominator of an appeal to the value of human life, critical thought, and a pluralistic society. Many of the key figures among the anti-totalitarian thinkers discussed here were European Jewish refugees who escaped totalitarian systems. Many who work on this question have been motivated by a desire to come to grips, philosophically, with what is undoubtedly the greatest intellectual justification for mass murder in history: the twentieth century totalitarian state.

Table of Contents

  1. Introduction
  2. Second World War and Cold War Thought
    1. The American Pragmatists on the Values of Pluralism and Democratic Debate
      1. John Dewey on Democratic Method
      2. Sidney Hook on Heresy versus Conspiracy
    2. The British Liberal Defence of the Open Society and Pluralism
      1. Karl Popper’s Indictment of Historicism
      2. Isaiah Berlin on Liberty
      3. Jacob Talmon on Totalitarian Democracy
    3. Hannah Arendt on the Origins and Implications of Totalitarianism
    4. Erich Fromm on Escaping from Freedom: A Psychoanalytical Approach
  3. Later Work
    1. Judith Shklar’s Liberalism of Fear
    2. Avishai Margalit on the Decent Society and Totalitarianism
  4. References and Further Reading

1. Introduction

The term “totalitarianism” dates to the fascist era of the 1920s and 1930s, and it was first used and popularised by Italian fascist theorists, including Giovanni Gentile. It progressively came to be extended to include not just extreme utopian dictatorships of the far right, but also Communist regimes, especially that of the Soviet Union under Joseph Stalin. It is still frequently associated with Cold War thought of the 1940s and 1950s, a period during which it was most widely utilised as a governing concept, although its philosophical implications transcend that era’s political fears and rhetoric. As used in this article, “totalitarianism” will refer to the most extreme modern dictatorships possessing perfectionistic and utopian conceptions of humanity and society.

Totalitarianism’s appeal is linked to a variety of perennial values and intellectual commitments. Although totalitarianism is a distinctly modern problem, proto-totalitarian notions may be found in a variety of philosophical and political systems. Plato’s Republic, in particular, describes a utopian, caste-based society in which both social and moral order are to be maintained and fostered through strict political control and eugenics.

In the seventeenth century, absolutists and royalists such as Thomas Hobbes and Jacques Bossuet advocated, in various ways, a strong centralized state as a guarantor against chaos in conformity with natural law and biblical precedent. However, it was only in the early twentieth century that totalitarianism, properly understood, became a conceptual and political reality. Thinkers as diverse as Carl Schmitt in Germany and Giovanni Gentile in Italy helped to lay the foundations of fascist ideology, stressing the defensive and unifying advantages of dictatorship. In the nascent USSR, Vladimir Lenin developed Marx’s ideas from a potentially totalitarian base into a full-blown communist ideology, in which Marx’s own phrase “the dictatorship of the proletariat” was interpreted explicitly to mean the dictatorship of the Soviet Communist Party.

The term “totalitarianism” is also sometimes used to refer to movements that in one way or another manifest extreme dictatorial and fanatical methods, such as cults and forms of religious extremism, and it remains controversial in scope. It has been a topic of interdisciplinary interest, with various typologies offered by political scientists (see Friedrich and Brzezinski 1956 for the locus classicus of such approaches).

This article will primarily examine some key models and criticisms of the problem of totalitarianism defended by preeminent philosophers, as well as the thoughts of some key and representative scholars in other disciplines whose work is of philosophical significance. Their perspectival range encompasses strongly liberal, intellectual historical, neo-Marxist and pragmatist approaches. All have wished to distinguish totalitarianism sharply from liberal democratic ideals and society.

2. Second World War and Cold War Thought

a. The American Pragmatists on the Values of Pluralism and Democratic Debate

It is by no means surprising that American pragmatists should have responded to the challenge of totalitarianism in the mid-twentieth century. Not just Cold War realities, but philosophical method and values were key factors in this response. Given its strong emphasis on experimental method and the value of individual experience and fallibilism in epistemology, pragmatism would seem prima facie inimical to dictatorship.

i. John Dewey on Democratic Method

Philosophy, in order to be at its best, requires both critical thinking and democratic action, on any interpretation of Dewey’s pragmatism. In a number of works published between the 1930s and his death in 1952, John Dewey felt compelled to defend democracy against the growth and expansionism of totalitarianism, and this engagement was in keeping with Dewey’s passion for social activism and public education over the course of his long life. Dewey’s action on this matter included chairing the 1937 Dewey Commission that critically examined Soviet charges against Leon Trotsky.

Dewey had been interested in the problems of democracy for some time when he wrote his 1939 democratic credo I Believe. The rapid expansion of fascism and the Soviet Great Purge of the mid to late 1930s alerted Dewey to imminent threats to individual freedom from diverse quarters. In this short work, Dewey stated that he felt compelled to emphasize the fundamental value and importance of individuals over the state in the face of creeping totalitarianism. He here affirmed the pragmatist conviction that experience and institutions tempered by democratic problem solving ought to be primary in social philosophy. Dewey held that such problem solving, in order to be ethically compelling, must be respectful of the fundamental primacy of individual rights. It must furthermore involve an important element of negotiation and compromise over dogmatic assertion.

Furthermore, Dewey held that the rise of modern dictatorships was in part a reaction to an excessive form of individualism that isolated human beings from each other, and that offered only modern capitalism in mass society as a choice:

The negative and empty character of this individualism had consequences which produced a reaction toward an equally arbitrary and one-sided collectivism. This reaction is identical with the rise of the new form of political despotism. The decline of democracy and the rise of authoritarian states which claim they can do for individuals what the latter cannot by any possibility do for themselves are the two sides of one and the same indivisible picture.

Political collectivism is now marked in all highly industrialized countries, even when it does not reach the extreme of the totalitarian state….[the individual] is told that he must make his choice between big industry and finance and the big national political state. (Dewey, 1993: 235-236).

ii. Sidney Hook on Heresy versus Conspiracy

Sidney Hook was Dewey’s prime disciple in the application of pragmatism to anti-totalitarian thought. In his highly controversial 1953 book, Heresy, Yes—Conspiracy, No, Hook incurred the allegation of McCarthyism due to his advocacy of a firm line against the American Communist Party, especially within academia and educational trade unions.

Hook, who was a social democrat for much of his career, distinguished between a genuinely progressive left that operates in a heretical and democratic manner, and the Stalinist American Communist Party and its fellow travellers. Heresy, for Hook, is an entirely legitimate expression of dissent on controversial matters. However, he held the Communist movement to be inherently conspiratorial and subversive of the very ground rules of democracy, and this led him to advocate restrictions against its carrying out policies and actions inimical to elected government. In effect, Hook affirmed the legitimacy of democracy protecting itself not just from external aggression, but from internal subversion in the interest of foreign aggressors, such as the USSR. He took this to be in keeping with the pragmatist emphasis on democratic consensus and open debate in the interest of solving social problems, a methodology diametrically opposed to Stalinism.

Hook’s core thesis of muscular liberalism is powerfully stated in a New York Times Magazine article subsequently expanded into a 1953 book:

Liberalism in the twentieth century must toughen its fibre, for it is engaged in a struggle on many fronts. Liberalism must defend the free market in ideas against the racists, the professional patrioteer, and those spokesmen of the status quo who would freeze the existing inequalities of opportunity and economic power by choking off criticism.

Liberals must also defend freedom of ideas against those agents and apologists of Communist totalitarianism, who, instead of honestly defending their heresies, resort to conspiratorial methods of anonymity and other methods of fifth columnists. (Hook, 1950: 143).

The usual objections to pragmatism are pertinent to its Deweyan anti-totalitarian strain. These revolve around the claim that pragmatism has an insufficiently robust and general conception of truth and evidence to serve as an adequate foundation for ethical and political principles. Ethical foundationalists, in particular, have rejected pragmatism as possessing excessively relativistic implications and as lacking a strong sense of moral tradition.

Contemporary pragmatists have, in different ways, attempted to respond to such criticisms by stressing the great value of democratic society in upholding value pluralism and open-ended inquiry:

…democracy is not just one form of social life among other workable forms of social life; it is the precondition for the full application of intelligence to the solution of social problems. (Putnam, 1992: 180).

Whether or not pragmatist anti-totalitarianism succeeds in its defence of democracy and individual rights is thus deeply linked to the coherence and adequacy of pragmatist defences of a fallibilistic and at times flexible conception of truth in ethics and politics. If there is no need for traditional ethical foundationalism in upholding the value of democracy against tyranny, then the pragmatist case against totalitarianism may be seen as a serious methodological option.

b. The British Liberal Defence of the Open Society and Pluralism

Although both Karl Popper and Isaiah Berlin were born outside Great Britain, they were leading theorists of anti-totalitarianism in British academia. The Israeli scholar Jacob L. Talmon was British-trained and is best seen as applying the British liberal tradition to the study of the Enlightenment. There are clear affinities between their positions on this issue, all of which continue the British liberal tradition well into the twentieth century, when it faced the challenge of the totalitarian state. The three representatives of British liberalism discussed here shared a commitment to individual liberty, a wariness of state power, and an evident suspicion of what they took to be the collectivist and utopian excesses of various Continental thinkers.

i. Karl Popper’s Indictment of Historicism

In several works, Karl Popper articulated a vigorous defence of liberal democracy over dictatorship. In his early work there is a particular emphasis on the unscientific and ultimately illogical character of all forms of historical determinism and collectivism. In The Poverty of Historicism, he stressed the philosophical errors of utopianism, and what he termed “historicism”—assuming or attempting to argue for the existence of deterministic historical laws, and the possibility of deriving accurate predictions from them. These predictions are purportedly scientific or metaphysical, and for Popper, they betray an epistemic confusion between falsifiable and limited predictions based on evidence, and “oracular prophecies” masquerading as science or philosophical rationality.

In keeping with his philosophy of natural science, Popper urges us to shun certainty and dogmatism in social science and history, in favour of a piecemeal approach characterised by attention to particulars and the trial and error methods of fallibilism. Such an approach is not only conducive to precise and clear social explanations; Popper defends it as a philosophical shield against tyranny as well. For it is precisely the immodesty of overgeneralising to alleged rigid laws in history that has led even great philosophers and other thinkers to commit the error of historicism, which is a key component of totalitarian and fanatical patterns of thought.

Popper defines “historicism” as a theory of history that affirms the existence of deterministic laws from which iron-clad predictions can be derived. He thus accuses purportedly scientific theorists of history, including Karl Marx, of misinterpreting trends as inexorable laws, thereby producing unscientific and potentially irrational schemes of historical development. When coupled with grandiose or holistic schemes of social engineering, such approaches, for Popper, combine bad social science with lethal utopianism. We ought, he claims, to opt for “piecemeal engineering” employing trial and error experimentation, openness to constructive criticism, and the falsification of our programs:

[commitment to holistic or utopian social engineering] prejudices the Utopianist against certain sociological hypotheses which state limits to institutional control….problems connected with the uncertainty of the human factor must force the Utopianist, whether he likes it or not, to try to control the human factor by institutional means, and to extend his programme, so as to embrace not only the transformation of society, according to plan, but also the transformation of man. (Popper, 1960: 69-70).

Although written slightly later than The Poverty of Historicism, Popper’s The Open Society and its Enemies was published during the Second World War. It is therefore best seen as an intellectual contribution to the Allied cause against fascism, which was subsequently readily adapted to the struggle against Soviet dictatorship during the Cold War. Both works are permeated by a sense that democracy was under fire and could potentially be annihilated by its totalitarian rivals.

Here Popper broadens his critique of totalitarianism by indicting major figures of the Western philosophical tradition, notably Plato, Hegel and Marx. All three, he held, were guilty of collectivist and utopian social projects. In diverse ways, Plato’s notion of guardianship and the philosopher kings, Hegel’s glorification of the militaristic nation state, and Marx’s belief in the inevitability of class warfare and violent revolution all share a misguided common denominator: the historicist belief in holistic explanations derived from alleged laws of historical inevitability. In place of this, Popper recommended a non-dogmatic “critical rationalism,” within an open society that respects debate and a quest for truth and knowledge. This method ought, at all costs, to be substituted for historicist and utopian grand schemes of social science and philosophy of history that are characterised by a kind of oracular faith in their own future prophecies, dogmatism, and immunity to falsification.

Popper explained the appeal of historicism as a product of a false conception of the power of social science and historiography, combined with alienation and dissatisfaction:

Why do all these social philosophies support the revolt against civilization? And what is the secret of their popularity? Why do they attract and seduce so many intellectuals? I am inclined to think that the reason is that they give expression to a deep felt dissatisfaction with a world which does not, and cannot, live up to our moral ideals and to our dreams of perfection. The tendency of historicism (and of related views) to support the revolt against civilization may be due to the fact that historicism itself is, largely, a reaction against the strain of our civilization and its demand for personal responsibility. (Popper, 2011: xxxix).

Popper’s faith in rationalism and the open society has been criticised by Leszek Kołakowski for not taking into account democracies’ propensity towards self-destruction. Kołakowski holds that the diverse ends of open societies can come into conflict with each other, thereby vitiating attempts to combine liberal values coherently. He writes of Popper’s model:

The open society is described less as a state constitution and more as a collection of values, among which tolerance, rationality, and a lack of commitment to tradition appear at the top of the list. It is assumed, naively so I think, that this set is wholly free of contradictions, meaning that the values that it comprises support each other in all circumstances or at least do not limit each other. (Kołakowski, 1990: 164).

This criticism points to the question of value pluralism as discussed by Isaiah Berlin: how can a multiplicity of values, some of them potentially mutually exclusive, provide a coherent and adequate buffer against repressive, totalitarian state power?

ii. Isaiah Berlin on Liberty

Throughout his career, Isaiah Berlin devoted a considerable amount of attention to the question of totalitarianism. He saw it as one of the most important features of twentieth century history, and as the logical outcome of an excessive devotion to what he took to be a dangerously paternalistic conception of liberty.

In a key work on the subject (1969, reprinted and expanded in 2002), Berlin drew an important distinction between the negative and positive conceptions of liberty or freedom:

The first of these political senses of freedom or liberty…which (following much precedent) I shall call the “negative sense,” is involved in the answer to the question “What is the area within which the subject—a person or group of persons—is or should be left to do or be what he is able to do or be, without interference by other persons?” The second, which I shall call the “positive” sense, is involved in the answer to the question ‘What, or who, is the source of control or interference that can determine someone to do, or be, this rather than that?’ The two questions are clearly different, even though the answers to them may overlap. (Berlin, 2002: 169).

He thus held that the former is the foundation of the pluralistic liberalism that he wished to defend, and that the latter is a very different notion, involving obligatory self-realisation through the perfection of the individual and society in accordance with natural or historical necessity. Whereas negative liberty is a cornerstone of toleration, openness to new knowledge and individual rights, positive liberty, for Berlin, is the state’s paternalistic high road to totalitarianism.

Long associated with despotic and dictatorial regimes, positive freedom had, by the mid-twentieth century, formed part of the justification for both communist and fascist dictatorships. By appealing to deterministic justifications, whether a supposedly scientific conception of historical law, social Darwinism, or the will of the people, totalitarian states of both the extreme left and the extreme right justified the murder of millions in the name of a unitary and static utopian future that they saw as set and predictable.

For Berlin, this totalitarian development of positive liberty was not an aberration, but a logical conclusion. It emerged in a particularly lethal form in the twentieth century due to its central role in the justification of illiberal and non-humanistic ideologies, including communism, fascism, and the sort of extreme romantic nationalism and clericalism already present prototypically in the thought of nineteenth century figures such as Joseph de Maistre.

Against this, Berlin urged humanity to seek a decent society with pluralistic values, thus eschewing utopian perfectionism. This he thought to be characterised by a fallibilistic conception of knowledge, peaceful trade-offs, and the rejection of nihilism and relativism in favour of common values across genuinely diverse ways of life. Such a society would, he held, resolve to maintain a pluralistic balance of values against any and all attempts to sacrifice entire groups of people in the name of a future that can never be fully predicted.

A key criticism of a stark division between negative and positive liberty has been offered by Charles Taylor (1985). He claims that the terms have been used in an excessively narrow way that does not do justice to the complexity of human freedom. In particular, the existence of what he has termed “strong evaluations” (Taylor, 1985: 220), that is, important qualitative distinctions in the ranking of individuals’ desires and projects, would seem to render incomplete any use of the idea of negative freedom as essentially a lack of coercion or obstacle. For Taylor, this conception of negative liberty stems from diverse and likely parallel sources in the Western philosophical tradition, such as Hobbes and Bentham. He claims that in order to do justice to freedom, even sophisticated liberals such as Mill have made significant use of concepts of self-development and improvement, and that this implies some degree of positive liberty. So positive liberty is best understood as a part of individual freedom and flourishing, and not necessarily a component of totalitarianism.

The extent to which the state should promote positive liberty remains an important question. Understood along the lines indicated by Taylor, it may be a value to be realized through self-development in a more democratic society. This is a view shared not only by Taylor but by other thinkers discussed below, such as Fromm.

iii. Jacob Talmon on Totalitarian Democracy

In 1952, Jacob L. Talmon published a liberal challenge to interpretations of eighteenth-century thought that saw the French Enlightenment as manifesting overwhelmingly liberal tendencies.

Talmon argued, in The Origins of Totalitarian Democracy, that both liberal-empirical and totalitarian tendencies were significant and influential in European thought by the time of the French Revolution. In particular, he held that key aspects of the thought of Jean-Jacques Rousseau and of lesser-known radical egalitarian Enlightenment figures such as Gabriel Bonnot de Mably and Étienne-Gabriel Morelly, whose ideas fed into Babouvism, are best seen as a foreshadowing of twentieth-century totalitarianism.

Like Berlin, Talmon stresses the fundamental divergence between individualist and collectivist or statist conceptions of freedom. He divided early modern democratic thought into two broad categories: “liberal” and “totalitarian” democracy. The former led, through a long process of parliamentary development across the nineteenth century, to the institutions regarded as democratic in the mid-twentieth century. The liberal democratic thought of Benjamin Constant and Alexis de Tocqueville in France, as well as John Stuart Mill in England, was instrumental in developing this political tradition to a philosophical apogee. Talmon traced its origins in part to John Locke’s defense of individual property rights. Totalitarian democracy, on the other hand, developed largely from radical French Enlightenment thought through Babeuf and the Jacobin stream of the French Revolution, and through nineteenth and early twentieth century Marxism. Talmon describes it as a form of “political Messianism.”

Liberal democracy has stressed the importance of individual human rights, empiricism and the rule of law from its beginnings. It advocates piecemeal reform and the application of rationality to arrive at optimal political remedies to social problems. Totalitarian democracy from Robespierre and the Jacobins through Karl Marx and into the twentieth century has been utopian, collectivist and statist. Talmon furthermore holds it to be characterised by historical determinism and a notion of a single comprehensible truth in political life.

The two intellectual tendencies both claim to promote freedom to the highest degree, but differ greatly in their conceptions of legitimate freedom. Both schools affirm the supreme value of liberty, but whereas the one finds the essence of freedom in spontaneity and the absence of coercion, the other believes it to be realized only in the pursuit and attainment of an absolute collective purpose. Liberal democrats believe that, in the absence of coercion, men and society may one day reach through a process of trial and error a state of ideal harmony. In the case of totalitarian democracy, this state is precisely defined, and is treated as a matter of immediate urgency, a challenge for direct action, an imminent event:

[Human beings,] in so far as they are at variance with the absolute ideal they can be ignored, coerced or intimidated into conforming, without any real violation of the democratic principle being involved. (Talmon, 1986: 2-3).

Talmon devotes considerable attention to what he takes to be Rousseau’s totalitarian tendencies in The Social Contract. Talmon finds especially collectivist Rousseau’s notion of the “general will” as standing over and above society and representing the highest aspirations of humanity. Furthermore, the idea that the individual can only find true liberation through the state and its supreme “Legislator” is the high road to dictatorship, for Talmon. Rousseau is thus seen as a merciless collectivist, willing to “force people to be free” in order to create a new and perfected type of human being. This ideal involves a notion of democracy as the constant and unanimous participation of the citizens of an ideal state in the acting out of the general will, thereby realising true democratic citizenship.

Talmon’s conception of the origins of totalitarianism in the French Enlightenment and its revolutionary heritage has been challenged on various grounds. The Canadian scholar C.B. Macpherson, influenced by Marxism, argued that Talmon erred in stressing ideas over class and social realities, and in thus making too strong a causal claim in linking notions of natural order and political unanimity to inevitable totalitarianism. Furthermore, he claimed that the Jacobins instituted a type of early totalitarian rule largely in response to the social pressures of revolutionary power and foreign counter-revolutionary invasion.

In effect, the true causes of historical change are seen as grounded in class and general social trends, and not in purely philosophical or ideological causes alone. This criticism holds that understanding key ideas and movements requires an understanding of their class background:

A petit-bourgeois movement like Jacobinism, or a proletarian movement still based on the same individualist assumptions (like Babouvism) is particularly liable to demand a completely general unanimity at a time when it is least possible. It might be argued that it was the petit-bourgeois character of these ideologies, rather than the assumption of a natural order, that led so readily to totalitarian dictatorship. (Macpherson, 1952: 57).

This criticism of Talmon’s core thesis bears affinities with a critique of Arendt, and it raises the general question of the social causation of ideas in an interesting way. To what extent are philosophical ideas solely, or at least primarily, responsible for mass movements throughout history, including totalitarianism? If Talmon and Arendt are right, such ideas certainly possess sufficient causal potency to be determining factors in social and political development. If their critics hold the high ground, Talmon and Arendt have inflated the importance of secondary or even epiphenomenal notions to an unrealistic degree.

c. Hannah Arendt on the Origins and Implications of Totalitarianism

In her seminal 1951 book, Hannah Arendt attempted to show how totalitarianism emerged as a distinctly modern utopian problem in the twentieth century, growing out of a lethal combination of imperialism, anti-Semitism and extreme statist bureaucracies. As much a work of intellectual history as political philosophy, The Origins of Totalitarianism jarred many due to its indictment of European civilization during a period of post-war reconstruction. Arendt held that totalitarianism was not a reactionary aberration, an attempt to turn back the clock to earlier tyrannies, but rather a revolutionary form of radical evil explicable by particularly destructive tendencies in modern mass politics. The atomisation of lonely individuals and the receptivity of mass society to propaganda in the modern age make totalitarianism an ongoing temptation, one to be resisted through critical thinking and the affirmation of fundamental human values.

Tracing what she took to be the prime causes of totalitarianism to the nineteenth century, Arendt focused on the rise of imperialism and political anti-Semitism, and the concomitant decline of both the remnants of the feudal order and the nation state. Imperialism and anti-Semitism both drew from racist and Social Darwinist wellsprings in their repudiation of unity through language, culture, and universal rights in favour of biologically fixed and hierarchical distinctions within humanity and a struggle for world conquest. The consequent de-humanisation of entire races and ethnic groups in favour of Aryanist ideals set the grounds for fascism, with the enthusiastic support of what Arendt termed “the mob,” that is, the resentful European déclassés. Furthermore, the narrow chauvinism of pan-Slavism coupled with notions of class warfare and annihilation paved the way for a parallel communist regime of terror in the Soviet Union.

Arendt held that in both its fascist and communist varieties, the totalitarian system’s terror is not incidental, but essential. Unlike authoritarian dictatorships that strive to uphold conservative values, such regimes by their very nature aim to destroy civil society and tradition in favour of a utopian re-fashioning of humanity to suit their collectivist ideological purposes. The twentieth century totalitarian state thus emerges as a juggernaut of terror, a terror maintained in no small part by the eradication of fundamental human values and all critical thought in favour of ideology and propaganda. It thereby seeks to destroy all communal and civil institutions between it and its atomised and lonely citizens. Arendt wrote:

The ideal subject of totalitarian rule is not the convinced Nazi or the convinced Communist, but people for whom the distinction between fact and fiction (that is, the reality of experience) and the distinction between true and false (that is, the standards of thought) no longer exist. (Arendt, 1968: 474).

A key challenge to Arendt’s analysis is shared with all such work on the frontier between political theory and intellectual history, namely its degree of empirical truthfulness and the precise accuracy of its causal explanations (Gleason, 1995). Establishing such causal connections requires the extensive use of detailed historical evidence, as well as the colligation of coexisting ideas upon which Arendt relied. So, the account is subject to the usual historiographical and logical criticisms concerning the possible gap between the causation of events and the correlation of trends.

For all of the considerable attention that The Origins of Totalitarianism attracted in 1951, it was in 1963 that Arendt was to produce one of the most controversial works ever written by a political philosopher. Eichmann in Jerusalem: A Report on the Banality of Evil did not merely generate much discussion; it produced an intellectual shock wave heard around the world that still reverberates:

Hannah Arendt’s Eichmann in Jerusalem was published fifty years ago….It’s hard to think of another work capable of setting off ferocious polemics a half century after its publication. (Lilla, 2013).

Arendt here developed and expanded her general conclusions on the Holocaust and fascist bureaucracy from a series of articles that she wrote on the Eichmann trial for The New Yorker.

She claimed that for all of his extreme evil, Eichmann was not a mysterious monster, either in his overall demeanour or in his political and moral psychology. His evil was as much a matter of consequences as of intent, and in fact his intentions emerged as mixed during the trial before the Israeli court. Arendt did not claim, in her thesis of “the banality of evil,” that Eichmann was entirely neutral in his managing of the Nazis’ final solution, as some have maintained. Rather, she saw him as a distinctly modern product of a totalitarian bureaucracy who at times was eager to implement Hitler’s genocide, but who also showed real tendencies towards narrow instrumental rationality, clichéd thought and speech patterns, and superficial amorality. She was thus struck by his at times entirely average bearing and thought patterns throughout the trial, for all of the enormous evil that he perpetrated.

Furthermore, Arendt claimed that the Eichmann case confirmed her view that totalitarianism represents a gross perversion of fundamental civilised and ethical values in favour of mass bureaucracy, propaganda and thoughtlessness. Both perpetrators and victims of the Holocaust were thus corrupted through a process involving the malevolence and instrumental efficiency of the Nazis, as well as the activities of a collaborating minority in the ghetto police and Jewish Councils. This last point was to provoke particular discomfort and sheer hostility, giving Arendt a virtual pariah status, although later Holocaust historiography has placed the general problem of collaboration in a more balanced context.

For Arendt, Eichmann was as much a product of the worst possible tendencies of state bureaucracy as a creator of them. This bureaucratic context in no way exonerated him, as she was careful to indicate; she held his execution in 1962 to be justified, even though she thought that there was a strong case in international law for trying him before an international tribunal rather than the Israeli court. However, the bureaucratic framework of Eichmann’s crimes required a re-examination of what she held to be a misleading diabolical conception of evil. That there is a tension between this account and the notion of radical evil developed in The Origins of Totalitarianism seems clear. However, both works share the important common denominator of an indictment of totalitarian bureaucracies that render the unthinkable not just possible, but probable and even banal. In a very real sense, this is a more disturbing thesis than Arendt’s earlier conception of evil as radical or in no small part beyond rational explanation. If Arendt was right overall, totalitarianism is a constant threat in modern mass societies, and no complacency on the matter can be justified.

d. Erich Fromm on Escaping from Freedom: A Psychoanalytical Approach

Among the various attempts to apply psychoanalysis to the question of totalitarianism, Erich Fromm’s Escape from Freedom is conspicuous for its sustained argumentation and conceptual scope. Fromm’s thesis that there exists an “authoritarian character” was subsequently developed through empirical case studies by Theodor W. Adorno and his co-authors in their work, The Authoritarian Personality.

Fromm was a philosophically inclined sociologist, who drew from both the Freudian and Marxist traditions in elaborating an explanation of diverse social phenomena. This is apparent in his view that there exists what might be termed a self-reinforcing causal mechanism between social processes and ideology, in which psycho-social factors are reinforced by belief systems, and vice versa.

Central to Fromm’s analysis is the notion that totalitarianism stems from several root causes linked to the full emergence of modern individualism in the aftermath of the Reformation. Medieval social psychology was strongly transcendental in its emphasis on the secondary character of secular authority under God, and thus it inhibited the development of the sense of loneliness and isolation that characterised Western history from about the sixteenth century onwards.

For Fromm, Protestantism stimulated the development of individualism in its stress on individual success and good works, dutiful submission to God, thrift, and a significant sphere for secular authority. A self-reinforcing causal mechanism became increasingly apparent, especially among the middle classes of modern capitalist society, as the new form of Christianity helped to create the modern individual, and was in turn strengthened by the resultant socio-economic psychology of modern European society.

However, there can be no turning back the clock, according to Fromm. Rather, modern humanity must strive to encourage healthy, life-affirming values and the expression of human freedom. This is best done by recognising, as a society, the values of love, spontaneity, and secure personal development.

Fromm proposes that the anxiety in isolated individuals, produced by the great burden to succeed demonstrably and without secured grace in the eyes of God, led to severe social and psycho-pathologies. In particular, collectivist ideologies, including totalitarianism, emerged to satisfy the modern individual’s need for a sense of a higher purpose or calling:

It seems that nothing is more difficult for the average man to bear than the feeling of not being identified with a larger group….The fear of isolation and the relative weakness of moral principles help any party to win the loyalty of a large sector of the population once that party has captured the power of the state. (Fromm, 1969: 234).

A chilling picture thus emerges of an inherently alienated and insecure modern society that generates mass social movements of conformity. For Fromm, this was especially true of the German lower middle class, which he held to be strongly influenced by modern individualistic ideologies. He furthermore held that this class was the most alienated in Germany, and thereby prone to a compensatory destructiveness, and that in the Weimar period it was strongly characterised by a sense of having lost its legitimate status. Thus the rise of Nazism had both important psycho-social and class factors, in his view.

Fromm analyses various “mechanisms of escape” by which the alienated seek relief from the burden of individual autonomy. Prime strategies, linked to totalitarianism’s appeal, are unthinking submission to the leader, and mindless conformity. The latter trend he saw not just in totalitarian society, but in capitalist democracies as well, and as requiring concerted social activism.

Both sadism and masochism are seen by Fromm as attempts to overcome feelings of individual powerlessness and meaninglessness. In politics, the authoritarian character is marked by a slavish and nihilistic submission to authority, and a desire to wield it over others. This character type, for Fromm, is the one most easily seduced by fascism.

If Fromm was correct in this, then the root causes of totalitarianism are both internal or psychological and external, in the form of trends in class relations and ideological evolution. The threat therefore remains nascent even in seemingly highly democratic modern societies, although Fromm did not advocate a relativism that would blur the lines between imperfect democracies and dictatorships.

Fromm, like Taylor, holds that positive notions of freedom can be of constructive value in counteracting political and social distortions and pathologies. In particular, a social democratic society that provides the individual with adequate resources and a sense of autonomous personal development can do much, he held, to reduce the appeal of totalitarian ideologies and to promote mental health and social ethics:

We must replace manipulation of men by active and intelligent cooperation, and expand the principle of government of the people, by the people, for the people to the economic sphere. (Fromm, 1969: 300).

Fromm’s analysis focussed considerably more on fascism than on communism. Its political diagnosis of Nazism, in particular, has been faulted even by sympathetic critics on several counts:

Fromm did not…treat the intensity of Hitler’s anti-Semitism, choosing instead to locate the Jew with the communist and the Frenchman as examples of Hitler’s purportedly “lesser” groups. Nor did Fromm point to the discredited Social Darwinist premises behind the Nazi quest for Aryan purity…. His hypothesis about the lower middle class has not held up. The Nazis gained votes from all classes. (Friedman, 2013: 113).

In his later work, Fromm extended his classic work on human aggression and destructiveness, providing psycho-biographies of totalitarian leaders such as Hitler, Himmler, and Stalin.

3. Later Work

a. Judith Shklar’s Liberalism of Fear

Throughout her work, the American political theorist Judith Shklar stressed the importance of seeing liberalism not as a utopian or perfectionistic ideal, but rather as a bulwark against tyranny and cruelty. In effect, she claimed that liberalism ought to be defined more by its opposition to oppression and nastiness than by anything else.

Shklar traces the roots of liberalism to the struggle for religious toleration in Reformation and Baroque Europe. In her model, a progressive consensus emerged in Western thought, holding that cruelty is supremely wicked. Early figures in this development include Montaigne and Montesquieu, whom Shklar contrasted with Machiavelli on this question.

This commitment to “put cruelty first” contributed greatly to the development of liberalism’s abhorrence of dictatorships of all kinds, including those of a modern totalitarian character. This implies an affirmation of memory over hope, and of sensitivity to the horrors of oppression over utopian aspiration. Not merely property rights, cultural pluralism, and the rule of law, but anti-tyranny first and foremost define the modern liberal perspective. If liberalism is rare historically and globally, this has more to do with the widespread character of cruel delusion than with any intrinsic defect on its part. For Shklar, we ought to remember at all costs the disastrous consequences of not putting cruelty first:

We must…be suspicious of ideologies of solidarity, precisely because they are so attractive to those who find liberalism emotionally unsatisfying, and who have gone on in our century to create oppressive and cruel regimes of unparalleled horror. (Shklar, 1998: 18).

Shklar’s negative liberalism has been criticised by Michael Walzer as setting reasonable anti-totalitarian boundaries for democratic action, while not recognising the importance of moving beyond them in the interest of social progress:

We always have to be afraid of political power; that is the central liberal insight. But this is an insight into a central experience that wasn’t discovered, only theorised by liberal writers. Nor does this fear by itself make for an adequate theory of political power. We must address the uses of power as well as its dangers. And since it has many uses, we must choose among them, designing policies, like Shklar’s guaranteed employment, that enhance and strengthen what we most value in [our] own way of life. Then we try to enforce those policies, carefully if we are wise, remembering the last time we were fearful, and acting within the limits of liberal negativity. (Walzer, 1996: 24).

If this is correct, then the strong anti-totalitarianism of the liberalism of fear should be seen as setting boundaries against tyranny, rather than final limits to progressive social policy. Positive liberty is thereby affirmed, within strong democratic boundaries.

b. Avishai Margalit on the Decent Society and Totalitarianism

In reaction to the strong emphasis upon theories of justice in late twentieth century political thought, Avishai Margalit presented his case for the “decent society.” Such a society is, first and foremost, one that does not humiliate people. This means not treating human beings as less than human, as mere machines, animals, or inanimate objects. For Margalit, even if a society is just institutionally and procedurally, it may nonetheless denigrate its citizens and subjects in diverse institutional ways, thereby rendering it formally civilized but indecent. Without denying the value of social justice and the rule of law, Margalit has claimed that philosophy and political theory long neglected decency, which is every bit as important as justice. In so doing, they could not do justice to one of the main forms of oppression: institutional and state contempt for individuals.

In The Decent Society, Margalit contrasts totalitarian and gossip societies. Both these types of society are, for Margalit, indecent in not respecting individuals and their own legitimate social space. Gossip societies allow for a considerable range of imperfection, but lack decency in their absence of respect for privacy, and their non-institutional or cultural humiliation of alleged non-conformists.

In their radical perfectionism, totalitarian societies have no respect for individual privacy, and they systematically and institutionally obliterate communal and family structure between the individual and the state. Such societies’ regimes do everything within their considerable power to humiliate their subjects so as ultimately to perfect them, by recognising no legitimate private space, and by gathering sensitive information with which to blackmail and control them. They are thus agents of ultimate indecency, for Margalit.

Friendship among anti-totalitarian dissidents is thus especially valuable and intense, because of the potentially life and death solidarity that is generated by opposition to supreme state and bureaucratic indecency. The violation of such friendships by forcing dissidents to reveal sensitive information about others to the state is, for Margalit, one of the worst aspects of totalitarianism:

Totalitarian societies have proved to be a prescription for and guarantor of brave friendship, since friendships in regimes of this sort are conspiracies of humanity against the inhumanity of the regime. (Margalit, 1996: 210).

Margalit’s analysis provoked some re-examination of political and social philosophy’s focus on justice. In particular, the general core question of the balance to be struck between decency and justice raises fundamental questions about value priority:

…one might take the view that the best way for a society to strive to become decent is by promoting justice. By treating people in accordance with justice, society denies them one sound reason to feel rejected from humanity, however much they may actually feel that way….the decent and the just society may be too closely intertwined for us to be able to say that one or other has clear priority as an ideal. (Patten, 2001: 231).

Patten’s question reminds us of the extent to which justice and rights have a fundamental role in social and political values. It should be clear that Margalit in no way wishes to deny the value of justice. However, one may recall here Arendt’s thesis to the effect that totalitarianism arose in part not only because of an indecent Social Darwinism, but due to the repudiation of universal human rights. This may well be a strong challenge to attempts to reduce the firm priority of justice in political life.

4. References and Further Reading

  • Adorno, Theodor W., Frenkel-Brunswik, Else, Levinson, Daniel J., and Sanford, R. Nevitt. The Authoritarian Personality. Harper & Brothers, New York, 1950.
  • Arendt, Hannah. Eichmann in Jerusalem: A Report on the Banality of Evil. Penguin, New York, 2006.
  • Arendt, Hannah. The Origins of Totalitarianism. Harvest Books, Harcourt Brace Jovanovich, San Diego, New York and London, 1968.
  • Arendt, Hannah. Between Past and Future: Six Exercises in Political Thought. Meridian Books, the World Publishing Company, Cleveland and New York, 1963. (Contains “What is Authority?”)
  • Berlin, Isaiah. The Crooked Timber of Humanity: Chapters in the History of Ideas. Edited by Henry Hardy. Pimlico, London, 2003a. (Contains “The Pursuit of the Ideal,” and “The Decline of Utopian Ideals in the West.”)
  • Berlin, Isaiah. Freedom and its Betrayal: Six Enemies of Human Liberty. Hardy, Henry (editor). Pimlico, London, 2003b.
  • Berlin, Isaiah. Liberty. Hardy, Henry (editor). Oxford University Press, Oxford and New York, 2002. (Contains several key essays in the section “Five Essays on Liberty.”)
  • Berlin, Isaiah. The Sense of Reality. Hardy, Henry (editor). Chatto and Windus, London, 1996. (Contains important essays such as “The Sense of Reality,” “Political Judgement,” and “Philosophy and Government Repression.”)
  • Bossuet, Jacques. Politics Drawn from the Very Words of Holy Scripture. Cambridge University Press, Cambridge, 1999.
  • Cotter, Matthew J. (editor). Sidney Hook Reconsidered. Prometheus Books, Amherst, NY, 2004.
  • Dewey, John. The Political Writings. Morris, Deborah and Shapiro, Ian (editors). Hackett, Indianapolis and Cambridge, 1993. (Contains “I Believe.”)
  • Friedman, Lawrence J. The Lives of Erich Fromm: Love’s Prophet. Columbia University Press, New York, 2013.
  • Friedrich, Carl J. and Brzezinski, Zbigniew K. Totalitarian Dictatorship and Autocracy. Harvard University Press, Cambridge, 1956.
  • Fromm, Erich. The Anatomy of Human Destructiveness. Holt, Rinehart and Winston, New York, 1973.
  • Fromm, Erich. Escape from Freedom (Fear of Freedom in the UK). Avon Books, New York, 1969.
  • Fromm, Erich. Man for Himself. Fawcett Premier, Greenwich, Connecticut, 1967.
  • Gentile, Giovanni. Origins and Doctrine of Fascism with Selections from Other Works. Edited by A. James Gregor. Transaction Publishers, Edison, NJ, 2003.
  • Gleason, Abbott. Totalitarianism: The Inner History of the Cold War. Oxford University Press, Oxford and New York, 1995.
  • Hobbes, Thomas. Leviathan. Wordsworth Editions, Hertfordshire, 2014.
  • Hoffmann, Stanley (editor). Political Thought and Political Thinkers. University of Chicago Press, Chicago and London, 1998.
  • Hook, Sidney. Heresy, Yes—Conspiracy, No. Greenwood Press, Publishers, Westport, Connecticut, 1953.
  • Hook, Sidney. “Heresy, Yes—But Conspiracy, No.” New York Times Magazine, July 9, 1950. Available online through Dissent Archives.
  • Kołakowski, Leszek. Modernity on Endless Trial. University of Chicago Press, Chicago, 1990.
  • Konvitz, Milton R. and Kennedy, Gail (editors). The American Pragmatists. Meridian Books, UK, 1960.
  • Lenin, Vladimir. The State and Revolution. Penguin Books, Middlesex, 2009.
  • Lilla, Mark. “Arendt and Eichmann: The New Truth.” New York Review of Books, November 21, 2013. Online version.
  • Litwack, Eric B. “Erratum to: Epistemic Arguments against Dictatorships.” Human Affairs 21, 2011, pp. 226-235.
  • Macpherson, C. B. “Review of The Origins of Totalitarian Democracy.” Past and Present, Number 2, November 1952, pp. 55-57.
  • Patten, Alan. “Review of The Decent Society.” Mind, Volume 110, Number 437, January 2001, pp. 229-232.
  • Plato. Republic. Oxford University Press, Oxford, 1993.
  • Popper, Karl. The Open Society and its Enemies. Routledge Classics, Oxford, 2011.
  • Popper, Karl. The Poverty of Historicism. Routledge, London, 1960.
  • Putnam, Hilary. Renewing Philosophy. Harvard University Press, Cambridge and London, 1992.
  • Schmitt, Carl. Dictatorship. Polity Press, Cambridge, 2013.
  • Schmitt, Carl. The Concept of the Political. University of Chicago Press, Chicago, 2007.
  • Shklar, Judith N. “The Liberalism of Fear,” in Political Thought and Political Thinkers. Hoffmann, Stanley (editor). University of Chicago Press, Chicago and London, 1998.
  • Shklar, Judith N. “Putting Cruelty First.” Daedalus Volume 111, Number 3, Summer, 1982, pp. 17-27.
  • Talisse, Robert B. “Politics without Dogmas: Hook’s Basic Ideals.” In Cotter 2004, pp. 117-128.
  • Talmon, J. L. The Origins of Totalitarian Democracy. Penguin Books, Harmondsworth, UK and New York, 1986.
  • Taylor, Charles. Philosophy and the Human Sciences. Philosophical Papers 2. Cambridge University Press, Cambridge and New York, 1985. (Contains “What’s Wrong with Negative Liberty?”)
  • Walzer, Michael. “On Negative Politics.” In Yack 1996, pp. 17-24.
  • Westbrook, Robert B. Democratic Hope: Pragmatism and the Politics of Truth. Cornell University Press, Ithaca and London, 2005.
  • Yack, Bernard (editor). Liberalism without Illusions: Essays on Liberal Theory and the Political Vision of Judith N. Shklar. University of Chicago Press, Chicago and London, 1996.

Author Information

Eric B. Litwack
Email: e_litwack@bisc.queensu.ac.uk
Queen’s University and Syracuse University in London
United Kingdom

Thomas Reid: Philosophy of Mind

This article focuses on the philosophy of mind of Thomas Reid (1710-1796), as presented in An Inquiry into the Human Mind on the Principles of Common Sense (1764) and Essays on the Intellectual Powers of Man (1785). Reid’s action theory and his views on what makes humans morally worthy agents, although connected to philosophy of mind, are not explored here.

Reid is best known as the father of common sense philosophy. He contends that going back to the principles of common sense will help deal with the problems engendered by the so-called “skeptical views” of his predecessors: Descartes, Locke, Berkeley, and Hume. He argues that “the way of ideas” generates undue uncertainty in the theory of knowledge. If the only things that can be known directly and immediately are the contents of one’s mind, there can be no certainty in knowledge directed toward the external world. Reid believes this goes against the common-sense view that humans do acquire certain knowledge through empirical observation of the external world, and are therefore not confined to knowing only the contents of their minds.

In philosophy of mind, Reid is most celebrated today for the arguments he gave in support of the position known as direct realism, which, at its most basic, states that the primary objects of sense perception are physical objects, not ideas in human minds. However, Reid’s philosophy of mind neither begins nor ends with perception. In addition to arguing for direct realism and, consequently, against “the way of ideas,” he undertook the task of establishing the equal status of the faculties of the mind, and of explaining the relationships that exist among them. He is a worthy successor of Locke, in that he believes that the mind is to be characterized in terms of a faculty psychology. He is a worthy successor of Newton, in that he believes that the scientific method is the right way of investigating the nature of mind. Reid characterized the scientific method mainly by trial and error, and by setting up experiments and drawing general conclusions from them.

One of the starting points of Reid’s philosophy of mind is a traditional distinction between the “powers of the understanding” and the “powers of the will.” Reid believes this distinction is not entirely correct because the mind is active whenever the powers of the understanding are exercised, and a certain degree of understanding is needed for any act of will. However, he uses it to classify the faculties of the mind into intellectual, on the one hand, and active, on the other. The distinction is used in the titles of his two mature published works: Essays on the Intellectual Powers of Man (1785) and Essays on the Active Powers of Man (1788), which he envisioned as two sides of the same coin. Reid thought that any theory of the mind should comprise an investigation into both types of mental operations.

Table of Contents

  1. Sensation
  2. Perception
    1. Original Perception
    2. Acquired Perception
  3. Memory
    1. General Considerations on Memory
    2. Memory and Personal Identity
  4. Intellectual Powers (Proper)
    1. Conception
      1. Bare Conception
      2. Imagination
    2. Judgment and Reasoning
      1. The Fundamental Characteristics of Judgment
      2. Common Sense
      3. First Principles of Common Sense
      4. Reasoning
  5. Taste
    1. Why This Faculty Is Called “Internal Taste”
    2. An Objectivist Account of Beauty
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Sensation

Reid argues that sensation is an original and simple operation of the mind, which for him means not only that certain beings (namely sentient ones) are born with an ability to sense, but also that this operation of the mind cannot be logically defined. All natural operations of the mind are simple and, in some sense, primitive, so that no reductive definition can be offered. This does not mean, however, that one cannot pay attention to the specific role played by this operation. In doing so, one will discover its most important features.

Although careful introspective observation will reveal that sensations do not usually occur on their own, but are almost always accompanied by perceptions, Reid points out that a clear-cut distinction between sensation and perception exists and should be accounted for. This distinction has to do primarily with the specific roles sensations and perceptions play in the knowledge of the external world. Sensations are of limited use, in this sense; they only give information about what goes on in the sentient being. Perceptions, on the other hand, contribute to the basic repository of knowledge about the external world. In sensing a smell or tasting a taste, for instance, a sentient being will take notice of how its mind is affected, but, as Reid points out, such sensations bear no resemblance to any of the qualities of the external objects that cause these sensations to occur in the sentient being. Here Reid differs from his predecessors: according to John Locke, for instance, at least some sensations (those derived from the primary qualities of objects) do resemble the external objects which occasion the formation of such simple ideas in sentient beings such as humans (Locke, Essay II. viii. 15). To make the distinction with perception more vivid, Reid discusses an example: in seeing a flower or touching a sugar cube—which involves perceiving and having contentful thoughts about these objects, as is elaborated in the next section—humans gain knowledge about what these external objects really are. There still is no resemblance thesis advanced, to be sure; the mind is simply projected outside itself and, in doing so, it objectifies the things in its environment. In this, Reid is very forward-thinking: he is the first philosopher to draw the distinction between sensation and perception that is extensively employed in contemporary philosophy of mind and psychology (as J. J. Gibson rightly noticed).

This distinction between sensation and perception rests primarily on a peculiarity of the faculty of sensation: Reid believes that this is the only operation of the mind that “hath no object distinct from the act itself” (EIP I. 1, 36). He acknowledges the fact that human language is misleading in this respect: for instance, for both sensation and perception, people use “the same mode of expression” (IHM 6.20, 167). This mode of expression involves an active verb and an object: one can say both that “I feel a pain” and that “I see a tree” (IHM 6.20, 167). But, Reid contends, in the former case the object itself is grammatical only, and not also real, whereas in the latter the object is a real thing, allegedly existing outside the perceiver’s mind.

It is less clear what Reid means when he says that the object is not real, but grammatical only, in the case of the construction expressing a sensation that one may feel. There are two ways of interpreting this claim, and this ambiguity tracks two distinct positions in the secondary literature on Reid. On the one hand, sensations, for Reid, can be understood to not have objects at all: as such, this mental operation is distinct from all others. If we understand sensation to have no object, to be about nothing, it cannot ever be wrong. This would mark sensation as a very special faculty among the faculties of the human mind; perception or memory are not like this: someone can misperceive a tree just as well as he can misremember having seen a tree. But a person can never be mistaken about a feeling that particular person has: whenever someone has a headache, that ache is real and it is that person’s and it is exactly as that person is feeling it. On the other hand, that passage has been read as saying that sensations take themselves as objects; Reid, in this interpretation, would subscribe to a reflexive view of sensations. Just like perceptions and memories, sensations are constituted by two other ingredients: a conception of the object, and a belief that the object exists, except, in the case of sensation, this object is the sensation itself, not an external object like trees, frogs, or human beings.

A consequence of understanding Reid as saying that sensations do not have any kind of objects is to think that he is a precursor of “adverbial” theories of sensation. In this account, a sentient being is not said to have a sensation of a red object, but to sense in a certain way whenever stimulated in the right manner. Sensations inform the sentient being of various ways of feeling: there is a particular way of feeling redly, as opposed to a particular way of feeling yellowly, and there is yet another way of feeling headachely (see also Sense-Data). Understanding that sensations provide us with a qualitative feel and making sense of what exactly this means has become very important in early 21st century discussions on the nature of mind and consciousness. According to some authors, such as David Chalmers, Frank Jackson, Joseph Levine, and Thomas Nagel, qualia offer sufficient proof that a complete reduction of all mental processes to purely physical processes (as described by a physicalist interpretation of brain processes) is impossible (for more, see Qualia). So, understanding Reid’s position in this manner will place him squarely in the same tradition as one of the most important debates in contemporary philosophy of mind.

The last attribute of sensations worth mentioning is their role as signs of external objects. Usually, sensations pass unnoticed (unless the sentient being carefully attends to them), the mind moving directly to the other things that they signify. This feature of sensations allows Reid to argue that they are never to be associated with Lockean ideas (Locke, Essay II. viii. 8): they are not the objects of perception, and, moreover, they are not mental intermediaries between the mind and the world. Perception of external objects turns out to be immediate, in Reid’s view (Reid on sensations as signs: IHM 2. 10, 43; IHM 4. 1, 49; IHM 6. 21, 177). To properly understand the role of sensations as signs of external objects, according to Reid, an analysis of perception should be given, a task undertaken in the next section.

2. Perception

Perception is the main faculty whose role is to give beings endowed with it brute knowledge about the external world: the knowledge is brute because no reasoning enters perception; and the result is knowledge, even though sometimes, when the perceiver believes that something is being perceived, the perceiver is in fact subject to a perceptual illusion or hallucination. However, even when a perceptual state results in a false outcome, the state itself should be characterized as perception (for more on how and why perception can be non-veridical, see EIP II. 22, 241–252). How, then, do sensations, as signs of external things, work to connect minds with external things? Reid argues that:

[A] requisite to our knowing things by signs is, that the appearance of the sign to the mind, be followed by the conception and belief of the thing signified. Without this the sign is not understood or interpreted; and therefore is no sign to us. […] Now, there are three ways in which the mind passes from the appearance of a natural sign to the conception and belief of the thing signified; by original principles of our constitution, by custom, and by reasoning. (IHM 6. 21, 177)

This passage is important in several respects: (i) it gives Reid’s “official” characterization of perception, and (ii) it lays the foundation for an important distinction at the level of perception. These two aspects are discussed in turn.

First, Reid argues that “the appearance of the sign” is followed by a conception and belief of the thing signified. When Reid gives his official characterization of perception he states that this faculty involves several others: the occurrence of a sensation suggests a conception and a belief of the existence of the thing perceived. Moreover, this existential belief is immediate, and not the product of reasoning (EIP II. 6, 96). If it were the product of reasoning, “the greatest part of men would be destitute of [the information had of external objects]; for the greater part of men hardly ever learn to reason; and in infancy and childhood no man can reason” (EIP II. 6, 101). Perception, therefore, must be able to occur independently from any act of reasoning.

The second feature of perception that the passage quoted above refers to is the distinction Reid draws between original perception and acquired perception: in the case of original perception, a natural sign (that is, a sensation) suggests a conception and a belief "by original principles of our constitution." In the case of acquired perception, by contrast, the natural sign in question suggests a conception and a belief "by custom," which most probably means "habit" and/or "experience." Let us take a closer look at this distinction by pinning down some of the essential features of original perception, and by emphasizing some of the points of departure from this model in the case of acquired perception.

a. Original Perception

According to Reid (IHM 6. 20, 171 and EIP II. 21), only two of the senses give beings endowed with them original perceptions, namely those of touch and sight. The sense of sight is somewhat problematic in this respect, though, since vision does not provide creatures endowed with it with original visual perceptions of some things, for instance depth, but only with acquired ones. In original tactile perception, the sensation had of the so-called “primary qualities of bodies” immediately suggests a conception and belief of the existence of these qualities, and of substances in which such qualities inhere. In original visual perception, the sensations of colors suggest conceptions and beliefs of the existence of the so-called “secondary quality” of color as existing outside of minds, in an external object. The perception of visible figure is also supposed to be original, according to Reid and, according to the standard interpretation of Reid, it is not accompanied by any type of visual sensation whatsoever. Why does Reid think that only two of the senses—touch and vision—can give beings that have them original perceptions?  Why cannot smell, taste, and hearing provide such beings with original perceptions? Can this have anything to do with the distinction between primary and secondary qualities of objects?  This is a good place to offer some details on Reid’s view of the distinction between primary and secondary qualities of objects. As previously mentioned, Reid thinks that Locke was wrong to believe that there is some resemblance between primary qualities of objects and the ideas or sensations sentient beings have of them. However, Reid himself draws a distinction between these two types of properties of objects:

There appears to be a real foundation for the distinction, and it is this: That our senses give us a direct and distinct notion of the primary qualities, and inform us what they are in themselves: But of the secondary qualities, our senses give us only a relative and obscure notion. They inform us only, that they are qualities that affect us in a certain manner, that is, produce in us a certain sensation; but as to what they are in themselves, our senses leave us in the dark. ([emphasis added]; EIP II. 17, 201)

Reid argues that knowledge of primary qualities—like squareness, or hardness, or motion—is direct: it captures everything there is to know about such a quality. Squareness, hardness, motion, and all the other mathematical qualities of bodies are known intrinsically. The conception human beings have of secondary qualities, like color, for instance, is not like this; hence it does not constitute knowledge. All there is to know about a secondary quality is that sentient beings are constituted in such a way that whenever a normal being is in contact with the color red, under normal conditions, that being gets a sensation which differs, in what it feels like to that being, from the sensation the same being gets whenever it is stimulated with the color yellow under normal conditions. Other examples of primary qualities of bodies include shape, size, and solidity. Besides color, other examples of secondary qualities are heat, cold, smell, and taste.

This distinction is important for understanding Reid's view of original perception, since one way of drawing this distinction is by reference to what kinds of things can be originally perceived, as opposed to what kinds of things can be perceived only in an acquired manner. It might seem that the distinction between original and acquired perception is essentially linked with the more traditional one between primary and secondary qualities of bodies. This is indeed what several scholars have argued, citing as main evidence for this interpretation the fact that human beings have direct conceptions only of primary qualities. Based on this type of conception, human beings gain knowledge only of primary qualities and, if perception is supposed to give perceivers knowledge, as Reid thinks, it seems clear that perceivers can perceive only primary qualities of bodies, since perceivers do not gain any knowledge, by their senses, of secondary qualities. This argument seems plausible, but it faces a serious difficulty: Reid specifically and consistently places color, a secondary quality, on the list of things that can be originally perceived (IHM 6. 20, p. 171; EIP II. 21, p. 236). So, if we are to listen to Reid, the distinction between primary and secondary qualities, on the one hand, and the distinction between original and acquired perception, on the other, do not carve the world in the same way. The distinction between original and acquired perception, therefore, must be clarified in a different way.

b. Acquired Perception

Acquired perception is distinguished from original perception primarily by the role of learning and experience. There is no need for any type of experience, according to Reid, for human beings to be able to perceive the primary qualities of bodies, and the bodies themselves, by touching them, for instance. However, one must learn to associate a certain sign, which conjures up an original perception or a sensation only, with a certain external object. There is a controversy in the literature concerning what exactly this learning involves: according to some authors (for example, Van Cleve (2004)), it initially involves inference or reasoning, thus excluding anything that we acquire in this way from the list of things that we actually perceive, since perception, for Reid, is a faculty that does not rest on the perceiver's reasoning powers, as indicated in the previous section (EIP II. 6, 101). According to other authors (such as Copenhaver (2010)), however, acquired perception never involves any type of reasoning. Rather, Reid intended acquired perception to be understood as a distinctively perceptual ability: with the passage of time, normal perceivers acquire more perceptual sensitivity to properties not represented in original perception. Here is Reid explaining how this happens in the case of the perception of depth and three-dimensional figure by sight:

It is experience that teaches me that the variation of colour is an effect of spherical convexity […]. But so rapid is the progress of the thought, from the effect to the cause, that we attend only to the last, and can hardly be persuaded that we do not immediately see the three dimensions of the sphere. (EIP II.21, 236)

The fact that this type of ability is called “acquired” should not suggest that it is less natural than the original variety. Beings endowed with the ability to develop acquired perception do not develop this ability consciously or only because they decide to acquire certain perceptions. Here is what Reid says concerning this:

In acquired perception, the signs are either sensations, or things which we perceive by means of sensations. The connection between the sign and the thing signified, is established by nature: and we discover this connection by experience; but not without the aid of our original perceptions, or of those which we have already acquired. After this connection is discovered, the sign, in like manner as in original perception, always suggests the thing signified, and creates the belief of it. (IHM 6. 24, 191)

Acquired perception thus builds upon the original abilities of sensing and originally perceiving things in nature that human beings have. In acquired perception, in contrast to original perception, the associations between signs and things signified are introduced by a combination of nature and experience. In original perception, these associations are the result of nature alone: this is the way humans are constituted. Reid believes that acquired perceptions are far more numerous than original perceptions (EIP II. 21, 235).

3. Memory

a. General Considerations on Memory

Memory, for Reid, is the perfect counterpart to perception: it is an original faculty of minds, which is meant to give beings endowed with it immediate access to the past. He argues that it is a first principle of common sense "[t]hat those things did really happen which I distinctly remember" ([emphasis added]; EIP VI. 5, p. 474) and that the knowledge that memory gives is "immediate knowledge of things past" (EIP III. 1, p. 253). No mental entities, such as ideas, mediate a being's access to the external world in memory, just as no such entities mediate such a being's access to the world in perception. There are three things involved in perception, and, similarly, there are three things involved in memory: a mind, a faculty, and an external object, of which the mind gains knowledge via the faculty in question. For Reid, "[m]emory implies a conception and belief of past duration" (EIP III. 1, p. 254). This formulation mirrors the one that he gave to explain how perception operates, although, both in the case of memory and in that of perception, these explanations are not definitions, since both faculties are simple, and hence cannot be reductively defined by analyzing their components. The external object, in the case of perception, is (allegedly) presently existing; the external object, in the case of memory, was (allegedly) existing in the past of the mind having the memory in question. Beings endowed with perception can be said to mis-perceive things—which are either different than they appear to be or do not exist at all; and beings endowed with memory can be said to mis-remember things—which were either different than they appeared to such beings or did not exist at all.

To present his own views on memory, Reid starts by criticizing his precursors, primarily Locke and Hume, for operating with a so-called "store-house" model of memory. Contrary to what he takes Locke and Hume to be saying, memory is not a repository for ideas, which can be revived whenever the person who had those ideas needs them again (for example, Locke, Essay II.xx.2). The main problem here, according to Reid, is that if an idea could indeed be revived in this way, that idea would be perceived again, and not actually remembered. This is because, as Reid understands them, Locke and Hume argue that ideas are the immediate objects of perception. So, whenever an idea is present to the mind—whether for the first time or when it is revived—the mind should be said to perceive it. What does memory contribute here, Reid asks? Even though Reid is not the most charitable interpreter of Locke or of Hume, some of the criticisms he raises are cogent. There is a threat of circularity in the account of memory offered by both Locke and Hume, as Reid understands them. Both accounts seem to presuppose memory, rather than explain it: the ability to recognize that an idea now present to the mind is exactly the same, qualitatively but not numerically (since both Locke and Hume believe that ideas are fleeting), as an idea that was present to the mind at a previous moment of time itself requires memory. The problem is that no idea contains any information, qualitative or representational, that could be used to identify that idea as being about the past.

So, what is Reid’s positive account of memory?  Here is what he says at the beginning of the Essay on memory:

Things remembered must be things formerly perceived or known. I remember the transit of Venus over the sun in the year 1769. I must therefore have perceived it at the time it happened, otherwise I could not now remember it. Our first acquaintance with any object of thought cannot be by remembrance. Memory can only produce continuance or renewal of a former acquaintance with the thing remembered. (EIP III. 1, p. 253-55)

This suggests that Reid is operating with a precursor of a distinction used in the psychological literature of the twentieth century, as advanced primarily by Tulving (1983). According to Tulving (1983), there are two main types of long-term memory: procedural—whereby one remembers how to perform certain actions (for instance, one remembers how to ride a bike or how to bake a cake), and declarative. This latter type is itself divided into episodic memory—whereby one remembers an experience that one underwent or an event one witnessed (for example, somebody remembers running in her first 5K race); and semantic memory—whereby one remembers that so-and-so is the case, where the fact remembered may be something that happened before one's time (such as when one remembers that Napoleon was defeated at Waterloo). Semantic memory is further distinguished from the episodic kind by the so-called "previous awareness condition" on episodic memory, which requires that someone have been present, as a witness of or agent in an event, for that event to be episodically remembered. Reid thinks that something like the previous awareness condition on episodic memory must be satisfied in cases like the one quoted above: for someone to remember something, that person must have perceived that thing at an earlier moment of time.

There is a debate among Reid scholars concerning this very issue: did Reid think that all memory should be understood as episodic, or did he have room in his theory for semantic memory as well? Some authors believe that, for Reid, all memory is episodic (for instance, Van Woudenberg (1999)); others believe that Reid was concerned with both semantic and episodic memory (such as Copenhaver (2009)). The consensus in the literature is, however, that Reid had nothing very interesting to say about procedural memory. This debate is important because it shows Reid to be very forward-thinking in his treatment of memory: he believes that episodic memory is fundamental for a being's immediate knowledge of its past.

So, how does memory connect a being endowed with such a faculty with past events? According to Reid, memory does not offer a being endowed with this faculty a present connection with an event experienced in the past. The access to past events is not by re-acquaintance, as Locke or Hume would say. The past acquaintance with the event itself is preserved through the conception and belief deployed in a memorial experience. This is because, according to Reid, apprehension, when employed by another faculty, such as perception or consciousness, is strictly related to the present moment:

It is by memory that we have an immediate knowledge of things past: The senses give us information of things only as they exist in the present moment; and this information, if it were not preserved by memory, would vanish instantly, and leave us as ignorant as if it had never been. (EIP III. 1, p. 253)

b. Memory and Personal Identity

Reid is famous for his criticism of Locke’s theory of personal identity. The success of this criticism depends on the explanation of the relationship that perception and consciousness, on the one hand, and memory, on the other, have with time. Perception and consciousness give a being endowed with such faculties immediate knowledge of presently existing things: of how the external world is, and of how the mental operations of the minds of such beings succeed one another, respectively. Memory, on the other hand, gives beings endowed with this faculty immediate knowledge of things past; and these things can be, in turn, external or internal. Someone can remember, for instance, having a certain nauseating sensation upon encountering some rotten food. That person will not only remember the state of the food, in this case, but also his having a certain unpleasant sensation.

Reid finds Locke’s theory of personal identity lacking on two counts: (i) first, Locke suggests that consciousness can extend to the past (Essay II.xxvii.9); (ii) second, Reid thinks that Locke is claiming that personal identity consists in memory—sometimes this theory of personal identity is called “the memory theory of personal identity.” The two issues are related, and the first one might very well be terminological: what Locke meant by “consciousness,” in this context, Reid means by “memory”:

Mr Locke attributes to consciousness the conviction we have of our past actions, as if a man may now be conscious of what he did twenty years ago. It is impossible to understand the meaning of this, unless by consciousness is meant memory, the only faculty by which we have an immediate knowledge of our past actions. (EIP III. 6, p. 277)

The second issue is more serious. The problem has to do with the fact that Locke seems to require sameness of memory for sameness of person. The type of memory involved here is episodic memory, and this might be why Locke thinks that consciousness is something that is needed here: in order to remember something about oneself episodically, a person must remember the event “from the inside.” For instance, if someone remembers, episodically, having run a 5K race this past Sunday, that person cannot be mistaken regarding who was the agent of the act of running in the race. That particular person also could not be mistaken about what it felt like to run a 5K this past Sunday. These are all characteristics of episodic memory. Furthermore, if that particular person cannot be mistaken with regard to who was the agent of this act of running (namely that person himself), then that particular person must have existed this past Sunday, at the time of the race. In thinking that memory is necessary for personal identity, Locke doesn’t seem to commit a grave error of reasoning.

Reid, however, argues that this account must be rejected, because it leads to absurd consequences. To show that he is right, Reid discusses the now famous case of the brave officer:

Suppose a brave officer to have been flogged when a boy at school, for robbing an orchard, to have taken a standard from the enemy in his first campaign, and to have been made a general in advanced life: Suppose also, which must be admitted to be possible, that when made a general he was conscious of his taking the standard, but had absolutely lost the consciousness of his flogging.

These things being supposed, it follows, from Mr. LOCKE’s doctrine, that he who was flogged at school is the same person who took the standard, and that he who took the standard is the same person who was made a general. Whence it follows, if there be any truth in logic, that the general is the same person with him who was flogged at school. But the general’s consciousness does not reach so far back as his flogging, therefore, according to Mr. LOCKE’s doctrine, he is not the person who was flogged. Therefore the general is, and at the same time is not the same person as him who was flogged at school. (EIP III. 6, p. 276)

This case, which builds upon an objection raised by Joseph Butler (1736), is supposed to show that personal identity, understood as consisting in memory, is not a consistent notion. Here is why: due to the transitivity of numerical identity, the old general should be numerically identical with the boy who was flogged for robbing an orchard. This should follow, on the assumption that the boy who was flogged is numerically the same as the brave officer, who, in turn, is supposed to be numerically the same as the old general. Memory ensures that the boy who was flogged is the same as the brave officer, since the brave officer remembers that incident from his childhood. It ensures, moreover, that the general is the same as the brave officer, since the general remembers (episodically) that event from his youth. But, on Locke's theory of identity, Reid claims, the general is not the same person as the boy who robbed the orchard, since the general does not remember (episodically) that event from his childhood. Two options remain: (i) explain personal identity without recourse to numerical identity, since transitivity holds for numerical identity but, as this example shows, fails for personal identity so understood; or (ii) give up Locke's theory of personal identity, since any theory that does not respect the rules of logic is irremediably flawed.
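
The structure of the objection can be set out schematically as follows. This is a minimal sketch only: the labels b (the boy), o (the officer), and g (the general) are introduced here purely for illustration and are not Reid's own notation.

```latex
% A schematic sketch of the brave-officer argument; labels are illustrative.
\begin{align*}
&\textbf{Locke's criterion:}\quad x \text{ is the same person as } y \iff y \text{ (episodically) remembers an experience of } x.\\
&(1)\quad b = o \quad\text{(the officer remembers the boy's flogging)}\\
&(2)\quad o = g \quad\text{(the general remembers the officer's taking of the standard)}\\
&(3)\quad b = g \quad\text{(from (1) and (2), by the transitivity of identity)}\\
&(4)\quad b \neq g \quad\text{(the general does not remember the flogging, so, by the criterion, they are not the same person)}\\
&\text{(3) and (4) contradict each other; hence memory cannot be what personal identity consists in.}
\end{align*}
```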

Reid chooses (ii) and argues that memory is neither necessary nor sufficient for personal identity. Memory is not a necessary condition for personal identity, since during their lives, human beings witness or are the agents of many events of which they have no recollection at later moments of time. However, it would be absurd to claim that just because someone doesn’t remember something having happened, that person wasn’t actually there. Here is what Reid says on the issue: “I may have other good evidence of things which befell me, and which I do not remember: I know who bare me, and suckled me, but I do not remember these events” (EIP III. 4, p. 264). Neither is memory a sufficient condition for personal identity, according to Reid, since even though someone may be able to remember episodically that he was the agent or the witness of an event, it is not his remembering the event that makes it the case that he himself is the same person he was then. “It may here be observed […] that it is not my remembering any action of mine that makes me to be the person who did it. This remembrance makes me to know assuredly that I did it; but I might have done it, though I did not remember it” (EIP III. 4, p. 265). Memory gives someone immediate knowledge of a past event that person witnessed or was the agent of, but it is not what makes that person the same person who was there at the time of the event.

Reid’s theory of personal identity is deflationary: he argues that this notion is primitive. The only way to understand more about this relation is by contrast to other relations: “I can say that diversity is a contrary notion, and that similitude and dissimilitude are another couple of contrary relations, which every man easily distinguishes in his conception from identity and diversity” (EIP III. 4, p. 263). Just like Locke before him, Reid acknowledges that identity, in general (thus including the special case of personal identity), presupposes “an uninterrupted continuance of existence” (EIP III. 4, p. 263). Due to this feature of identity, there is no way to think that mental states and processes remain identical over time:

Hence we may infer, that identity cannot, in its proper sense, be applied to our pains, our pleasures, our thoughts, or any operation of our minds. The pain felt this day is not the same individual pain which I felt yesterday, though they may be similar in kind and degree, and have the same cause. The same may be said of every feeling, and of every operation of the mind: They are all successive in their nature like time itself, no two moments of which can be the same moment. (EIP III. 4, p. 263)

Thus, Reid thinks that persons should not be identified with their thoughts or feelings, but with the subject of such thoughts and feelings, which remains the same over time. This subject is an immaterial substance, a soul, which is best understood by reference to Leibniz’s notion of a monad (EIP III. 4, p. 264).

4. Intellectual Powers (Proper)

a. Conception

The fourth Essay is dedicated to conception, whose primary role is to be an ingredient (or concomitant) in all other operations of the mind. In this picture, conception is being used as part of the endeavor to gain knowledge of the external world (when it is employed by the senses), of the internal world (when it is employed by consciousness), and also to analyze the complex relationships that exist among the objects of the world, among numbers in mathematics, and among rules of reasoning in logic. As such, conception is a faculty that acts as a bridge, connecting the information gathered by the senses with the intellectual processing powers of judgment and reasoning.

Since conception is a simple operation of the mind, it cannot be subjected to a reductive definition any more than the other operations can be. However, as always, Reid argues that it has certain features which are useful to know in order to better understand how it functions, both when it is an ingredient or concomitant of other operations, and when it is employed on its own, as “bare conception.”

Reid argues that conception is an ingredient in all of the other operations of the human mind:

Our senses cannot give us the belief of any object, without giving some conception of it at the same time: No man can either remember or reason about things of which he hath no conception: When we will to exert any of our active powers, there must be some conception what we will to do: There can be no desire or aversion, love nor hatred, without some conception of the object: We cannot feel pain without conceiving it, though we can conceive it without feeling it. These things are self-evident. (EIP IV. 1, 296)

As already pointed out, the argument that sensations must be intentional, and hence take themselves as objects, is based on this idea that every operation of the mind has conception as an ingredient. The passage quoted above can indeed be read as saying that one must conceive of the pain one is feeling at a given moment of time in order to actually be able to feel it. However, it is controversial in Reid scholarship what exactly “conception” is supposed to mean in this context, despite its name. The issue concerns the fact that Reid believes that human beings share most of their perceptual and sensory abilities with lower-level animals and with human infants, who do not have a well-developed conceptual framework; thus, some authors argue that “conception” should not be taken to mean that unless one is able to have and deploy fully formed concepts, one will not be able to feel pain, for instance. In this interpretation, conception should be understood as the operation that allows beings endowed with this faculty to get acquainted with an object, be that object something that exists in the present, existed in the past, or will never exist.

On the other side of this controversy are those authors who point out that it is rather counter-intuitive to believe that conception does not operate via concepts—after all, the name might be indicative of something here. The role of conception, as an ingredient in all the other operations of the human mind, is to allow humans to secure a mental grip on something. Such a mental grip is secured by deploying a singular concept, understood to be something like a uniquely identifying definite description. In this interpretation, a being would not be able to have a sensation, a perception, or a memory unless it was able to deploy a singular concept, a uniquely identifying definite description isolating that thing in the world.

i. Bare Conception

Reid calls conception, as employed on its own, and not as an ingredient in any of the other operations of the human mind, “bare conception.” This suggests that when employed on its own, conception has a different role than when employed by a faculty of the mind in which it enters as an ingredient: “yet it may be found naked, detached from all others, and then it is called simple apprehension, or the bare conception of a thing” (EIP IV. 1, p. 286).

One of the most interesting features of bare conception is its ability to be used to think about objects without any heed being paid to their existence or non-existence, and also about propositions, without any judgment of their truth or falsity.

In bare conception there can neither be truth nor falsehood, because it neither affirms nor denies. Every judgment, and every proposition by which judgment is expressed, must be true or false; and the qualities of true and false, in their proper sense, can belong to nothing but to judgments, or to propositions which express judgment. In the bare conception of a thing there is no judgment, opinion, or belief included, and therefore it cannot be either true or false. (EIP IV. 1, p. 296)

Conception, in this sense, is that faculty allowing human beings to grasp the meaning of a proposition, which is the prerequisite for being able to judge a certain proposition as true or false: “it is one thing to conceive the meaning of a proposition; it is another to judge it to be true or false” (EIP I. 1, p. 25). Things are being conceived by beings endowed with this faculty in the following manner: an object is brought before the mind, with the help of conception: “I conceive an Egyptian pyramid. […] the thing conceived may be no proposition, but a simple term only, as a pyramid, an obelisk” (EIP I. 1, p. 25). Bare conception seems to require the mind of the conceiver to use certain concepts—simple terms—to bring forth objects to the mind in a way in which conception, when employed as an ingredient in other operations of the human mind, does not. This should not be surprising, though: once someone is able to think about something, even when he is not perceiving or remembering it, his mind will have established a certain grasp of that thing, classified and analyzed it, such that he will be able to think about it without using any of his other faculties. How this comes about will be better understood once Reid’s accounts of abstraction, judgment, and reasoning are presented, but it is already worth noting that it is not conception that supplies the mind with the most simple and exact notions the mind has of external things; these are acquired by using the mind’s superior reasoning powers (EIP IV. 1, p. 309).

Ideas as acts of the mind. Bare conception can be understood by analogy with painting, Reid argues, but he warns us that analogous thinking can take us only so far. Conception should be distinguished from painting, since “[t]he action of painting is one thing, the picture produced is another thing. The first is the cause, the second is the effect” (EIP IV. 1, p. 300). Reid’s worry is that conception will be thought to work in the same way, to produce images of things in the mind, or ideas. Reid denies that this is the case, and puts forward a theory of ideas as acts of minds rather than objects of such mental operations: “Let this therefore be always remembered, that what is commonly called the image of a thing in the mind, is no more than the act or operation of the mind in conceiving it” (EIP, IV. 1, p. 300). To unpack this further, let us think about the elements involved in conceiving that the sun is yellow, for instance. Reid argues that in this act of conception there are the following three elements: a mind, an act of conception that the sun is yellow, and the thing itself—the sun—external to the mind in question. Furthermore, he denies that there is any fourth element: an image in the mind, an additional representation that has the explicit content of a yellow sun. He is willing to assert that this is just a verbal dispute, if everyone else is willing to agree with him that these images in the mind, or ideas, are nothing more than acts of conceiving—a moot point, given that the philosophers he had in mind were dead by the time he was writing and could not have agreed with him. But, in effect, this is a serious conceptual point.

The analogy with painting should help classify conceptions into three classes, according to Reid. Just like a painter paints by using his imagination, by copying from other paintings, or by painting live subjects, there are conceptions which can be called “creatures of fancy”—like Don Quixote or Pegasus; conceptions of universals—which are analogous to paintings which copy other paintings; and conceptions of individual (existing) things—which are like paintings of live subjects.

Our conceptions, therefore, appear to be of three kinds: They are either the conceptions of individual things, the creatures of God; or they are conceptions of the meaning of general words; or they are the creatures of our own imagination. (EIP IV. 1, p. 305)

There are two issues worthy of attention in this classification: (i) Reid argues that people can name the creatures of fancy they invent, “conceive them distinctly, and reason consequentially concerning them, though they never had an existence” (EIP IV. 1, p. 301-2). And (ii) conceiving universals—like kinds and species of things—means nothing more nor less than to conceive the “meaning which other men who understand the language affix to the same words” (EIP IV. 1, p. 302). The first of these issues shows Reid to think that it is possible for fictional names to be used in the same way as regular names, even though the former category will be used to name nonexistents.

Reid’s Meinongianism. Based on Reid’s idea that people can think and “reason consequentially” about fictional characters and objects, Nichols (2002) argued that Reid is a precursor of Meinong. Reid’s rejection of the way of ideas and his dedication to common sense philosophy are thought to amount to a rejection of the position according to which conceiving the nonexistent means nothing more than conceiving images or any other types of mental intermediaries. Centaurs, not centaur-inspired images or ideas, are the objects of such centaur thoughts. The only exception is constituted by a thought which is explicitly about a painting of a centaur, in which case it should be obvious to everyone that what is being conceived is an image, and not a mythological animal.

This one object which I conceive [a centaur], is not the image of an animal, it is an animal. I know what it is to conceive an image of an animal, and what it is to conceive an animal; and I can distinguish the one of these from the other without any danger of mistake. (EIP, IV. 2, p. 321)

Reid does not talk about different levels of existence; there is no doubt that centaurs do not exist as flesh-and-blood animals. It is important, however, to note that Reid ascribes intentionality to all the operations of the human mind, and this intentionality is to be explained by understanding how conception works.

ii. Imagination

At the beginning of the EIP, when Reid is defining the terms he is going to use throughout the book, and at the beginning of the fourth Essay, where he lays down his views on conception, he claims that “conception” and “imagination” are synonymous words, and, moreover, that no reductive definition of these notions can be given, since they are supposed to denote simple operations of the mind. However, in the course of his analysis of conception, it becomes clear that imagination is not exactly the same thing as conception.

Reid argues that “imagination,” when used with its proper meaning, denotes a type of conception that is concerned primarily with the objects of sight (EIP IV. 3, p. 326). This restriction to sight probably has more to do with etymology than with the proper meaning of “imagination.” Imagination is supposed to apply to other senses, although Reid thinks that such uses are not altogether proper (EIP V. 6, p. 394). Any conception is of the imaginative kind when it is lively and about possible objects of sense. One consequence is that people can never be said to imagine universals, or propositions; neither are people supposed to think that anyone is imagining objects of sense, when they are actually perceiving them. A different kind of conception is responsible for the proper workings of perception.

Reid’s distinction between conception proper and imagination is one of the first instances in philosophy of mind in which imagination is presented as a faculty of the human mind related most closely to perception. Reid’s main breakthrough is his arguing that conception proper is used for understanding and acquiring general and abstract concepts, while imagination is used to think about things that might have existed, and, as such, might have presented beings endowed with such a faculty or system with perceptual stimuli.

b. Judgment and Reasoning

Reid dedicates two essays to the mental powers of judgment and reasoning with which he believes human beings to be endowed by nature. Essay VI, the one dedicated to judgment, presents the main elements of what Reid takes to be the philosophy of common sense. After a general introduction, in which he describes the fundamental characteristics of judgment, Reid argues that certain principles should be taken for granted as true. These are the first principles of common sense, which describe how the external and internal worlds work. These principles are self-evident and as such their truth cannot be demonstrated through any kind of reasoning. In the following essay, dedicated to reasoning, Reid argues that it is the purview of this faculty to produce judgments, or to combine and analyze them, in two main ways: deductively or probably. In what follows, these issues are discussed in turn, by first explaining what Reid thought about judgment, and then providing a schematic account of how deductive reasoning is supposed to be applied to the class of necessary truths, while probable reasoning is supposed to be applied to the class of contingent truths.

i. The Fundamental Characteristics of Judgment

Reid talks about judging in terms of offering mental assent or dissent to the issues represented by any particular judgment. Reid thinks that if human beings were not endowed with such an operation, they would not be able to reason abstractly. Without analyzing, abstracting, and judging when they reached correct conclusions, human beings would have been given reasoning in vain:

[S]ome exercise of judgment is necessary in the formation of all abstract and general conceptions, whether more simple or more complex; in dividing, in defining, and in general, in forming all clear and distinct conceptions of things, which are the only fit materials of reasoning. (EIP VI. 1, p. 413)

Some authors argue that judging should not be understood as involving just the mental affirmation or denial of its content, since that would not distinguish judging from believing. Although Reid’s official characterization of judgment is meant to clarify how this mental operation accompanies all others, belief already implies a mental assent/dissent given to its content. In the picture Reid is putting forward, there seems to be no way to explain why somebody would assent (dissent) to something without that person’s already having a belief that it is true (or false). Judgment, therefore, seems to presuppose belief. Judgment, then, would simply be superfluous, while belief would be ubiquitous, either as a concomitant or an ingredient in all other operations of the human mind (Rysiew (2004): 65). This, however, contradicts Reid’s official characterization of judgment:

[A] man who feels pain, judges and believes that he is really pained. The man who perceives an object, believes that it exists, and is what he distinctly perceives it to be; nor is it in his power to avoid such judgment. And the like may be said of memory, and of consciousness. Whether judgment ought to be called a necessary concomitant of these operations, or rather a part or ingredient of them, I do not dispute. But it is certain, that all of them are accompanied with a determination that something is true or false, and a consequent belief. If this determination be not judgment, it is an operation that has got no name; for it is not simple apprehension, neither is it reasoning; it is a mental affirmation or negation; it may be expressed by a proposition affirmative or negative, and it is accompanied with the firmest belief. (EIP VI. 1, p. 409)

To save Reid from this inconsistency, some have argued that the distinctive character of judgment emerges not from his official characterization of this mental operation, but rather from his comparing it to an external, real-life tribunal. This analogy is not perfect, and per Reid’s instructions (EIP I. 4, p. 55), people should not be lulled into a sense of confidence that they really know what they are talking about when they invoke analogous thinking, especially with regard to analogies concerning the body—or all things external—and mind—or all things internal. However, people are entitled to use the same name—“judgment”—to refer to both the process that results in an assenting/dissenting opinion in a court of law, and to the one that results in an assenting/dissenting belief in the internal tribunal, in virtue of the process involving reasoned reflection and deliberation. The fundamental characteristic of judgment in Reid’s system is its deliberative/reflective character, and not its relation to assent or dissent, which is, in turn, reserved for belief (Rysiew (2004): 67).

ii. Common Sense

Reid argues that sense and judgment are intrinsically related, such that sense always implies judgment: “A man of sense is a man of judgment” (EIP VI. 2, p. 424). He believes this to hold true both for what he calls “the external senses” (for instance, touch, taste, sight) and for the so-called “internal senses” (for instance, moral sense and internal taste). Since Reid believes (mistakenly, as discussed above) that judgment is the operation of the mind that helps people determine, “concerning any thing that might be expressed by a proposition, whether it be true or false” (EIP VI. 3, p. 435), and since he talks about common sense in the Essay dedicated to illuminating the nature of judgment, it should be obvious that he thinks that common sense is a specialized kind of judgment, understood as a faculty of the human mind. To wit, Reid thinks that common sense is that minimal degree of understanding that every adult human being possesses (or should possess), such that he can function well in this world. Common sense is concerned only with propositions that express self-evident truths (or falsehoods); judgment, more generally, is concerned with propositions that express any other kinds of truths or falsehoods.

Reid believes that self-evident principles are at the foundation of any kind of knowledge and that common sense is the mental operation that discovers such principles for human beings:

All knowledge, and all science, must be built upon principles that are self-evident; and of such principles, every man who has common sense is a competent judge, when he conceives them distinctly. Hence it is, that disputes often terminate in an appeal to common sense. (EIP VI. 2, p. 426)

This suggests that Reid thinks that human beings are all endowed with a mental operation—common sense—that is meant to discover the first principles upon which any kind of science is built. These first principles, when considered distinctly, namely in isolation from anything else, will be immediately found to be true, just as anything merely parading as a first principle will, when considered distinctly, be found to be false. No one undergoes a complicated reasoning procedure to discover the truth (or falsehood) of such principles; everyone just knows this, because, in being self-evident, these principles wear their truth conspicuously. In other words, what results from exercising the faculty of common sense is intuitive knowledge. Reid explains that reason and common sense do not conflict, because common sense is part of reason, just as judging does not oppose reason:

We ascribe to reason two offices, or two degrees. The first is to judge of things self-evident; the second to draw conclusions that are not self-evident from those that are. The first of these is the province, and the sole province of common sense; and therefore it coincides with reason in its whole extent, and is only another name for one branch or one degree of reasoning. (EIP VI. 2, p. 433)

Deduction from true principles can never contradict common sense, since “truth will always be consistent with itself” (EIP VI. 2, p. 433).

iii. First Principles of Common Sense

Reid thus believes that human beings are endowed with a faculty that gives them immediate knowledge of self-evident principles. He calls this faculty “common sense,” but it is more common to refer to the results of employing this faculty by the name of “intuitive knowledge.” The main idea here is that such knowledge of first principles is widespread: for instance, people are said to intuit axioms in mathematics and in logic; they also are thought to intuit first principles in morals, just as they intuit first principles regarding the expression of beauty in the arts, Reid believes. This knowledge is not innate; after all, as an empiricist, Reid thinks that all knowledge is acquired. The faculty of common sense, just like all the other original faculties, is innate, in the sense that it is part of the mental architecture of a human being. The sense in which this intuitive knowledge is immediate without being innate is the following: once reasoning and the ability to process a human language are sufficiently developed, a human being will be able to know, non-inferentially, that certain propositions, when considered distinctly, are true.

Reid calls such propositions first principles, and he argues that they can be divided into two classes: first principles of contingent truths, on the one hand, and first principles of necessary truths, on the other. As Van Cleve (1999) points out, just because the former type of principle has contingent truths as its contents, this does not mean that such principles are, in any way, less necessary than the principles of necessary truths. It is the truths themselves that are either necessary or contingent:

The truths that fall within the compass of human knowledge, whether they be self-evident, or deduced from those that are self-evident, may be reduced to two classes. They are either necessary and immutable truths, whose contrary is impossible, or they are contingent and mutable, depending upon some effect of will and power, which had a beginning, and may have an end. (EIP VI. 5, p. 468)

Since this article is concerned with the main tenets of Reid’s philosophy of mind, first principles are interesting for this purpose only insofar as they are discovered by a faculty—common sense—with which every human being is supposed to be endowed, and they will not be discussed in more detail.

iv. Reasoning

If the first principles of common sense are discovered by employing the operation of intuitive judging, reasoning proper is to be employed to discover whatever conclusions follow from self-evident principles. Since there are two classes of first principles, Reid argues that there are two types of reasoning. Demonstrative reasoning is employed to draw conclusions that follow from the first principles of necessary truths, whereas probable reasoning is employed to draw conclusions that follow from the first principles of contingent truths (EIP VII. 3, p. 556).

The strength of demonstrative reasoning, which is commonly employed in mathematics and logic, is such that, in order to show that a conclusion follows from some axioms (or first principles), nothing more needs to be done than to offer one demonstration. Reid thinks that it would be superfluous to try to give several different demonstrations of one conclusion when employing demonstrative reasoning, even though a variety of proofs may be available in practice:

To add more demonstrations of the same conclusion, would be a kind of tautology in reasoning; because one demonstration, clearly comprehended, gives all the evidence we are capable of receiving. (EIP VII. 3, p. 556)

It is not so with probable reasoning:

The strength of probable reasoning …depends not upon any one argument, but upon many, which unite their force, and lead to the same conclusion. Any one of them by itself would be insufficient to convince; but the whole taken together may have a force that is irresistible, so that to desire more evidence would be absurd. (EIP VII. 3, p. 556)

Probable reasoning is the method of choice for all the natural sciences, whose true propositions are contingent. According to Reid, probable reasoning comes in degrees, whereas demonstrative reasoning does not admit degrees; it is absolute.

In every step of demonstrative reasoning, the inference is necessary, and we perceive it to be impossible that the conclusion should not follow from the premises. In probable reasoning, the connection between the premises and the conclusion is not necessary, nor do we perceive it to be impossible that the first should be true while the last is false. (EIP VII. 1, p. 544-45)

Although Reid argues that probable reasoning is of a different kind than demonstrative reasoning (EIP VII. 3, p. 557), according to Lehrer (1989: 174), probable reasoning can lead to conclusions that are certain. Reid thinks that the vulgar are mistaken in contrasting probable reasoning with certainty. Probable reasoning, according to Reid, has degrees of evidence, “from the very least to the greatest which we call certainty” (EIP VII. 3, p. 557).

Hume, in the Treatise, argues that all knowledge should be reduced to probability, because human beings are fallible creatures, endowed with fallible faculties. Reid’s understanding of probable reasoning as a type of reasoning that leads to certain conclusions constitutes a direct refutation of Hume’s argument. The problem, Reid points out, is that requiring a proof of the reliability of the human faculties would be circular, because it could be given only by using those reasoning powers themselves, “and is therefore that kind of sophism which Logicians call petitio principii” (or “begging the question”) (EIP VII. 4, p. 571). Hume writes that “[n]ature, by an absolute and uncontrollable necessity, has determined us to judge, as to breathe and feel” (Hume, Treatise I.iv.1, p. 183). Reid agrees with Hume in part: probable reasoning concerning cause and effect, for instance, is the result of an innate principle of human constitution. Such a principle is known to be true, by intuition, and by exercising the faculty of common sense. But Reid also disagrees with Hume, and points out that probable reasoning concerning cause and effect is not merely a matter of custom. The relevant first principle of contingent truth allows human beings to be certain that effect follows its cause, not because they reason that it is so, but because they judge (intuitively) that it is so.

5. Taste

Reid considers the principles of the so-called “internal taste” in Essay VIII, the last of the EIP. Contemporary philosophy of mind is mostly silent concerning the way human beings interact with and appreciate works of art; the widespread belief seems to be that such issues belong to value theory rather than to the philosophy of mind proper. Reid, however, is part of a different tradition, which sought to explain the interest humans have in art and its artifacts, and consequently the interactions humans seek with such artifacts, by starting from observations about human psychology. As such, he, just like some of his predecessors (for example, Hume, Hutcheson, and Shaftesbury), thinks that adult human beings are endowed with a special faculty, taste, which is supposed to help them appreciate beautiful or aesthetically relevant things, and disapprove of those that are found to lack the sought-after qualities. Reid is thus mostly describing and analyzing the aesthetic experience, rather than addressing issues that are relevant from the point of view of the philosophy of art. In the course of doing this, however, he is interested in questions pertaining to art and artworks. Reid has an expression theory of art, in that he is interested in how art can express emotion, or, better still, how artists can and do express emotions through an artistic medium. If art is a sort of language, the faculty of taste, as applied to the aesthetic qualities of artworks, is the way to be made privy to this language: by employing this faculty, human beings become sensitive to the signs and decode their meaning. However, this is not the only way people employ their internal sense: by using this faculty they also become sensitive to the aesthetic qualities of the world. Reid’s idea is that just as a painter expresses an emotion in his works, God expresses certain emotions in his works. One cannot gain complete knowledge of the external world, in this picture, unless one understands and appreciates the beauty of the world.

a. Why This Faculty Is Called “Internal Taste”

This name indicates that the faculty itself is of the same kind as the other type of taste, but in what sense is it “internal”? To better understand this, consider the distinction that Reid draws between things internal and things external to the mind at the beginning of the EIP:

When…we speak of things in the mind, we understand by this, things of which the mind is the subject. Excepting the mind itself, and things in the mind, all other things are said to be external. (EIP I. 1, p. 22)

This distinction is as elucidating as it is confusing: since both types of taste are operations of the mind, they both are, in a sense, internal. However, Reid’s idea is that the “external taste” is supposed to help those beings that have it register information about certain pleasing and displeasing qualities of food and drink. The objects that can be food and drink are external to the mind—they are physical things to be found in the world. So, by analogy, it should probably be thought that “the internal taste” is supposed to help those beings that are endowed with it register information about certain pleasing and displeasing qualities of internal objects—namely, minds and their qualities.

Reid does not argue that other minds can be directly perceived, but he takes it to be a first principle of common sense that other minds exist (the 8th first principle of contingent truths, EIP VI. 5, p. 482-483), and that people learn of their existence by correctly deciphering certain signs. This interpretation of natural signs is innate, since, Reid claims, even small children respond in the correct (that is, expected) way in the presence of an angry parent, for instance. In this picture, the internal sense of taste is meant to discern the quality of excellence that other minds possess, in addition to enhancing the knowledge people have of “the existence of life and intelligence in our fellow-men.” To do so, however, the internal taste orients itself to material objects (since it cannot directly interact with other minds), and identifies that which is beautiful, in nature and in the fine arts (EIP VIII. 1, p. 573).

b. An Objectivist Account of Beauty

Putting everything together, here is the picture that emerges: Reid believes that beauty is a property both of objects and of minds. Moreover, he thinks that beauty itself is both a primary and a secondary quality of objects. Reid’s claim that beauty is a real property of objects directly opposes the idea that beauty is just a feeling in an agent’s mind, advanced by Hume and Hutcheson. As in morals, in the domain of aesthetic value, Reid is an objectivist (at least, according to Benbaji (1999)). The aesthetic (or internal) taste has the dual role of discovering which material objects are beautiful and, indirectly, which minds (namely, those that created the beautiful objects) are inherently beautiful. Beauty, in this picture, is not a feeling in one’s mind, but something external to one’s mind. The internal taste is used to reach aesthetic judgments by evaluating material objects, which express the mental attributes of the artist. Without excellence in the mind, no product of that mind can be perceived as beautiful. Beauty is thus a property of the artist’s mind, and is displayed by the artifacts he creates only in a derivative sense. The internal taste functions very much like perception of external objects: certain signs of aesthetic qualities trigger a conception of, and a belief in the existence of, the aesthetic quality in question. The internal taste is thus assimilated to the external sense of taste, since both senses are supposed to contribute to the perception of specific qualities of objects.

6. References and Further Reading

a. Primary Sources

  • Hume, D. (2007). A Treatise of Human Nature. Oxford: Clarendon Press. (Original work published in 1739-40.)
    • The standard edition of Hume’s Treatise.
  • Hume, D. (1874-75). “Of the Standard of Taste,” in vol. 3 of The Philosophical Works of David Hume. Edited by T. H. Green and T. H. Grose. 4 volumes, London: Longman, Green.
    • Hume considers whether there can be any objective standard of taste.
  • Hutcheson, F. (2004). An Inquiry into the Original of Our Ideas of Beauty and Virtue. Edited by W. Leidhold. Indianapolis: Liberty Fund. (Original work published in 1726.)
    • This presents Hutcheson’s sentimentalist understanding of beauty.
  • Locke, J. (1979). An Essay Concerning Human Understanding. Oxford: Clarendon Press. (Original work published in 1700.)
    • This is the standard edition of Locke’s Essay.
  • Reid, T. (1997). An Inquiry into the Human Mind on the Principles of Common Sense. Edited by Derek R. Brookes. Edinburgh, UK: Edinburgh University Press. (Original work published in 1764.)
    • This is the standard edition of Reid’s Inquiry. Cited in text as IHM, chapter, section, page number.
  • Reid, T. (2002). Essays on the Intellectual Powers of Man—A Critical Edition. Edited by Derek R. Brookes. Edinburgh, UK: Edinburgh University Press. (Original work published in 1785.)
    • This is the standard edition of Reid’s work on the intellectual powers. Cited in text as EIP, essay, chapter, page number.
  • Reid, T. (2010). Essays on the Active Powers of Man—A Critical Edition. Edited by Knud Haakonssen and James A. Harris. Edinburgh, UK: Edinburgh University Press. (Original work published in 1788.)
    • This is the standard edition of Reid’s published work on action theory.

b. Secondary Sources

  • Alston, W. P. (1989). “Reid on Perception and Conception.” In M. Dalgarno, & E. Matthews (Eds.) The Philosophy of Thomas Reid, (pp. 35–47). Dordrecht: Kluwer.
    • Argues that conception, despite its name, does not involve the use of any concepts.
  • Benbaji, H. (1999). “Reid’s View of Aesthetic and Secondary Qualities.” Reid Studies 2, 31-46.
  • Buras, T. (2005). “The Nature of Sensations in Reid.” History of Philosophy Quarterly, 22(3), 221–238.
    • Interprets Reid as saying that sensations are reflexive acts of the mind, taking themselves as objects.
  • Buras, T. (2008). “Three Grades of Immediate Perception: Thomas Reid’s Distinctions.” Philosophy and Phenomenological Research, 76(3), 603–632.
    • Explains that there are three senses of “immediacy,” in Reid, making clear the connection between immediacy and original perception, and acquired perception.
  • Buras, T. (2009). “The Function of Sensations in Reid.” Journal of the History of Philosophy, 47(3), 329–353.
    • Explains what function sensations perform: primarily, they give sentient beings information about how they react to the environment.
  • Copenhaver, R. (2000). “Thomas Reid’s Direct Realism.” Reid Studies, 4(1), 17–34.
    • Explains Reid’s account of perception, classifying it as direct realism.
  • Copenhaver, R. (2004). “A Realism for Reid: Mediated but Direct.” British Journal for the History of Philosophy, 12(1), 61–74.
    • Explains the intermediary role of sensations in the chain of perception.
  • Copenhaver, R. (2010). “Thomas Reid on Acquired Perception.” Pacific Philosophical Quarterly, 91(3), 285–312.
    • Offers a compelling argument to show that acquired perception is indeed a form of perception, and not reasoning.
  • Copenhaver, R. (2006a). “Thomas Reid’s Philosophy of Mind: Consciousness and Intentionality.” Philosophy Compass, 1(3), 279–289.
    • Offers a comprehensive explanation of Reid’s philosophy of mind, centered on the concept of intentionality.
  • Copenhaver, R. (2006b). “Thomas Reid’s Theory of Memory.” History of Philosophy Quarterly, 23(2), 171–187.
    • Discusses the ways in which memory gives people direct knowledge of the past, according to Reid.
  • Copenhaver, R. (2009). “Reid on Memory and Personal Identity.” Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/entries/reid-memory-identity/
    • Offers a comprehensive account of Reid’s theory of memory.
  • Falkenstein, L. (2004). “Nativism and the Nature of Thought in Reid’s Account of Our Knowledge of the External World”. In T. Cuneo, & R. Van Woudenberg (Eds.), The Cambridge Companion to Reid, (pp. 156–179). Cambridge: Cambridge University Press.
    • Explains Reid’s brand of nativism, which allows him to keep fixed certain principles which are dear to the British Empiricists.
  • Falkenstein, L., & Grandi, G. (2003). “The Role of Material Impressions in Reid’s Theory of Vision: A Critique of Gideon Yaffe’s ‘Reid on the Perception of the Visible Figure.’” Journal of Scottish Philosophy, 1(2), 117–133.
    • Argue that no sensations are involved in the perception of visible figure.
  • Folescu, M. (2015). “Perceiving Bodies Immediately: Thomas Reid’s Insight.” History of Philosophy Quarterly, 32(1), 19–36.
    • Argues that bodies are objects of original perception, despite perceivers’ gaining only relative (that is, not direct) notions of them by the use of their senses.
  • Folescu, M. (2015). “Perceptual and Imaginative Conception.” In Todd Buras and Rebecca Copenhaver (eds.), Mind, Knowledge and Action: Essays in Honor of Reid’s Tercentenary, (pp. 52–74). Oxford: Oxford University Press.
    • Argues that Reid should have been sensitive to the fact that conception is not employed in the same manner by the perceptual and by the imaginative systems, respectively.
  • Folescu, M. “Thinking About Different Nonexistents of the Same Kind.” Published online first in Philosophy and Phenomenological Research. DOI: 10.1111/phpr.12196
    • Argues that Reid’s account provides the tools for entertaining singular imaginings of different fantastical creatures of the same kind.
  • Gallie, R. (1997). “Reid: Conception, Representation and Innate Ideas.” Hume Studies, 23(2), 315-35.
    • Argues that conception requires linguistic representation.
  • Ganson, T. (2008). “Reid’s Rejection of Intentionalism.” Oxford Studies in Early Modern Philosophy, 4, 245–263.
    • Argues that sensation is not intentional: it is not about any objects, be those objects the sensations themselves.
  • Kivy, P. (2004). “Reid’s Philosophy of Art.” In T. Cuneo, & R. Van Woudenberg (Eds.) The Cambridge Companion to Reid, (pp. 267–312). Cambridge: Cambridge University Press.
    • Argues that Reid is one of the first philosophers interested in philosophy of art, rather than aesthetics, in general.
  • Kivy, P. (1978). “Thomas Reid and the Expression Theory of Art.” The Monist, 61(2), 167–183.
    • Argues that Reid has, primarily, an expression theory of the arts: artworks express the emotions of their creators.
  • Kroeker, E. R. (2010). “Reid on Natural Signs, Taste and Moral Perception.” In S. Roeser (Ed.), Reid on Ethics: Philosophers in Depth, (pp. 46–66). Palgrave Macmillan.
    • Argues that original beauty and other aesthetic qualities are intrinsic qualities of minds.
  • Lehrer, K. (1978). “Reid on Primary and Secondary Qualities.” The Monist, 61(2), 184–191.
    • Presents and defends the distinction between these two types of properties of objects.
  • Lehrer, K. (1989). Thomas Reid. London and New York: Routledge.
    • Offers a comprehensive exposition of Reid’s philosophy.
  • Manns, J. W. (1988). “Beauty and Objectivity in Thomas Reid.” British Journal of Aesthetics, 28, 119–131.
    • Argues that beauty is objective, for Reid, on the principles of common sense, but not objective, on the correct philosophical principles.
  • Nauckhoff, J. C. (1994). “Objectivity and Expression in Thomas Reid’s Aesthetics.” Journal of Aesthetics and Art Criticism, 52, 183–191.
    • Argues that minds are excellent, hence beautiful, and that any other object deemed beautiful has that quality in virtue of being a sign of some excellence.
  • Nichols, R. (2002). “Reid on Fictional Objects and The Way of Ideas.” The Philosophical Quarterly, 52(209), 582–601.
    • Argues that Reid’s rejection of the “way of ideas” leads him to adopt a form of moderate Meinongeanism, before Meinong.
  • Nichols, R. (2007). Thomas Reid’s Theory of Perception. Oxford: Oxford University Press.
    • Analyzes the major tenets of Reid’s theory of perception.
  • Pappas, G. S. (1989). “Sensation and Perception in Reid.” Noûs, 23(2), 155–167.
    • Defends the distinction between sensation and perception in Reid; a classic piece in Reid studies.
  • Rysiew, P. (1999). “Reid’s [Mis]characterization of Judgment.” Reid Studies 3(1), 63–68.
    • Argues that, despite his official characterization, “judgment,” for Reid, should be understood to mean reflection.
  • Tulving, E. (1983). Elements of Episodic Memory. Oxford: Oxford University Press.
    • Explains what types of memory there are, and why episodic memory is fundamental.
  • Van Cleve, J. (1999). “Reid on the First Principles of Contingent Truths.” Reid Studies 3, 3–30.
    • Argues that the first principles of contingent truths allow Reid to be a reliabilist with regard to the cognitive faculties of human beings, without any kind of circularity.
  • Van Cleve, J. (2004). “Reid’s Theory of Perception.” In T. Cuneo, & R. Van Woudenberg (Eds.) The Cambridge Companion to Reid, (pp. 101–133). Cambridge: Cambridge University Press.
    • A comprehensive account of Reid’s theory of perception, with special care given to identifying Reid’s type of realism: direct or indirect. This is the best starting point for anyone interested in getting a better understanding of Reid’s theory of perception.
  • Van Woudenberg, R. (1999). “Thomas Reid on Memory.” Journal of the History of Philosophy, 37(1), 117–133.
    • Discusses the elements of Reid’s theory of memory.
  • Van Woudenberg, R. (2004). “Reid on Memory and the Identity of Persons.” In T. Cuneo, & R. Van Woudenberg (Eds.) The Cambridge Companion to Thomas Reid, (pp. 204–221). Cambridge: Cambridge University Press.
    • Discusses the role of memory in personal identity.
  • Wolterstorff, N. (2001). Thomas Reid and the Story of Epistemology. Cambridge: Cambridge University Press.
    • Explains Reid’s terminology and way of thinking such that contemporary epistemologists can see Reid as an exponent and precursor of some of the issues discussed today.
  • Yaffe, G. (2003a). “The Office of an Introspectible Sensation: A Reply to Falkenstein and Grandi.” Journal of Scottish Philosophy, 1(2), 135–140.
    • Responds to the criticisms raised by Falkenstein and Grandi to the idea that all kinds of perceptions, including the perception of visible figure, involve sensations.
  • Yaffe, G. (2003b). “Reid on the Perception of Visible Figure.” Journal of Scottish Philosophy, 1(2), 103–115.
    • Argues that perceiving the visible figure of objects, for Reid, involves having sensations of color.

 

Author Information

Marina Folescu
Email: folescum@missouri.edu
University of Missouri
U. S. A.

Demonstratives and Indexicals

In the philosophy of language, an indexical is any expression whose content varies from one context of use to another. The standard list of indexicals includes pronouns such as “I”, “you”, “he”, “she”, “it”, “this”, “that”, plus adverbs such as “now”, “then”, “today”, “yesterday”, “here”, and “actually”. Other candidates include the tenses of verbs, adjectives such as “local”, and a range of expressions such as “yea” or “so” as used in constructions such as “yea big” (said, for example, while holding one’s hands two feet apart). Certain indexicals, often called “pure indexicals”, have their content fixed automatically in a context of use in virtue of their meaning. “I”, “today”, and “actually” are common examples of pure indexicals. Other indexicals, often called “true demonstratives,” require some kind of additional supplementation in a context in order to successfully refer in the context. The demonstrative pronouns “this” and “that” are clear examples of true demonstratives, because they require something of the speaker—some kind of gesture, or some kind of special intention—in order to resolve what the speaker is referring to. Which expressions are pure indexicals and which are true demonstratives is itself a matter of controversy. (The terms “pure indexical” and “true demonstrative” are due, as with so much else on this topic, to David Kaplan.)

Contemporary philosophical and linguistic interest in indexicals and demonstratives arises from at least four sources. (i) Indexical singular terms such as “I” and true demonstratives such as “that” are perhaps the most plausible candidates in natural language for the philosophically controversial theory of direct reference (see section 3e). (ii) Indexicals and demonstratives provide important test cases for our understanding of the relationship between linguistic meaning (semantics) and language use (pragmatics). (iii) Indexicals and demonstratives raise interesting technical challenges for logicians seeking to provide formal models of correct reasoning in natural language. (iv) Indexicals raise fundamental questions in epistemology about our knowledge of ourselves and our location in time and space.

By far the most influential theory of the meaning and logic of indexicals is due to David Kaplan. Almost all work in the philosophy of language (and most work in linguistics) on indexicals and demonstratives since Kaplan’s seminal essay “Demonstratives” has been a development of or response to Kaplan’s theory. For this reason, the majority of this article focuses on the details of Kaplan’s theory. Before introducing Kaplan’s theory, however, it discusses the most important precursors to Kaplan, some of whose views have been revived and given new defenses in light of Kaplan’s work.

Table of Contents

  1. Some Preliminaries
    1. Expressions and Utterances
    2. Types and Tokens
    3. Occurrences
  2. Precursors to Kaplan’s Theory
    1. Peirce on Indexical Signs
    2. Russell on Egocentric Particulars
    3. Reichenbach on Token-Reflexives
    4. Burks on Indexical Symbols
    5. Objections to Utterance-based Theories
  3. Kaplan’s Semantic Theory of Indexicals
    1. Background and a Basic Insight
    2. Character, Context, and Content
    3. Truth Relative to a Context
    4. Indexicality and Modality
    5. Some Consequences of Kaplan’s Theory of Indexicals
  4. True Demonstratives
    1. Two Challenges Posed by True Demonstratives
    2. Reference Fixing for True Demonstratives
    3. Adding True Demonstratives to Kaplan’s Theory
    4. David Braun’s Context-Shifting Semantics for True Demonstratives
  5. Kaplan’s Logic of Indexicals
    1. The Core Idea of Kaplan’s Logic
    2. Kaplan’s Other Semantic Theory
  6. Objections to Kaplan’s Semantic Theory and Logic
    1. Objections to Direct Reference
    2. Objections to Kaplan’s Treatment of Contexts
    3. Objections to Kaplan’s Logic
  7. Alternatives to Kaplan’s Theory of Indexicals
    1. John Perry’s Reflexive-Referential Theory
    2. Expression-Based Alternatives
  8. References and Further Reading

1. Some Preliminaries

Indexicals are words or phrases. To talk carefully about them, we need some resources for talking carefully about words and phrases. There are more distinctions here than may be apparent at first glance. In the case of indexicals and demonstratives, some of these distinctions are crucial.

a. Expressions and Utterances

Suppose that a speaker, Greg, utters the sentence “I am hungry”. We can distinguish between the action that Greg has performed—the utterance—and the sentence or expression that Greg has uttered. If Molly also utters “I am hungry”, then Molly and Greg have uttered the same sentence, but they have performed different actions. There is also a way of talking about actions on which we can say that Molly and Greg have performed the same action—they have both uttered “I am hungry”—but this is not the way we will talk about actions here. As we will use the term, an utterance is a particular event that occurs at a particular time and place. In this sense, Greg’s utterance and Molly’s utterance are distinct events, because they occurred at different places (and perhaps at different times).

We will also generalize our use of “utterance” so that it refers to inscriptions—acts of writing sentences—as well as to acts of speaking. So if Greg and Molly each write “I love you” on a sheet of paper, we will say that they have performed different (though similar) utterances. Yet in this case as well, they have written the same sentence. This slight extension of the standard use of “utterance” is common in discussions of indexicals and demonstratives. As we will see below, written notes provide interesting test cases for certain theories of indexicals.

b. Types and Tokens

It is also important to distinguish an utterance from the particular concrete instance of a sentence, word, or phrase that is produced or used in the course of an utterance. This distinction is easiest to see in the case of writing, where an act of writing produces some concrete thing—ink or graphite marks on a page, chalk marks on a blackboard, a specific distribution of pixels on a screen, and so forth. Following Charles Sanders Peirce, philosophers call these concrete instances of words, phrases, or sentences tokens. Tokens can also take the form of particular patterns of sound, as in the case of spoken language, and here again, it is important to distinguish the act of producing a particular pattern of sound—an utterance—from the particular pattern of sound produced—a token.

In our examples involving Greg and Molly above, we said that Greg and Molly each uttered the same sentence. This means that what we are calling the sentence that Greg and Molly both uttered is not the same thing as either of the tokens that they have produced. Again following Peirce, we will say that the tokens that Greg and Molly have each produced are instances or tokens of the same sentence type. Similarly, Greg and Molly have each produced tokens of the word type “I”. While tokens are concrete things, types are abstract. While tokens are located in particular places in space and time, types are not located anywhere.

The precise status and nature of types is a difficult question. Here are just two examples of the kinds of puzzles that arise when one begins to think about types versus tokens. (i) Are types universal? They seem to be, given that they are abstract objects that are in some sense instantiated by their tokens. (ii) In virtue of what are two tokens of the same type? In some cases, this may seem straightforward: if you are viewing this article on two different screens (or perhaps on a screen and a printed copy), you see two tokens of this sentence that are orthographically very similar to one another. But what about a token of “I am hungry” written by hand in a cursive script on a piece of paper, and another produced by Greg speaking the sentence? In virtue of what are these tokens of the same type? They have little in common in virtue of which we can say that they are similar. Despite these difficulties, we will continue to talk about tokens and types in the ways outlined above.

In what follows, the terms “word”, “phrase”, “sentence”, and “expression” will refer to types. Whenever we need to refer to particular tokens, we will use phrases such as “the token of the sentence ‘I am hungry’ produced by Greg”. Some philosophers are not always as careful about this usage as they could be, and anyone who wants to read further in the literature on the topic is warned to pay careful attention to how different philosophers talk about language. Some philosophers use a convention whereby putting a numeral before a sentence, as in the case of (1), allows them to use that numeral to refer to the sentence.

(1) I am hungry.

This is the convention that we follow in this article. Thus, our examples above involve different cases of speakers uttering (1) by producing different tokens of it. But other philosophers will use the numeral to refer to some hypothetical utterance of the sentence, and others to the token produced in such an utterance.

One further point is sometimes important in discussions of indexicals: an utterance of a sentence need not involve the production of a token of that sentence. For example, I might write a note on the top sheet of a Post-it pad that says “I will return at 2:30” and post it on my office door. Here I have produced a token of the sentence “I will return at 2:30”. But next week, I might use the same sheet again, by reposting it to my office door. This time, it seems that I am uttering the sentence “I will return at 2:30” by using a token that I produced earlier. Thus, when we speak of utterances, we will also mean to include cases like this in which an agent uses a previously produced token.

c. Occurrences

The distinctions above, between utterances and expression types and tokens, are common in discussions of language. There is one other category, however, that should be borne in mind when thinking about indexicals and demonstratives. To see this, consider first the kind of question that is commonly used to introduce the distinction between types and tokens:

How many words are written between the following pair of tokens of quotation marks: “a rose is a rose”?

The question here may be taken in different ways: three words have been written, but two of those words have been written twice. Thus, in the token of “a rose is a rose” above, there are two tokens of “a” and two of “rose”. So if you were to mark off the number of times that any token of a word appears between the tokens of the quotation marks above, you would count five tokens.

Now consider the question

How many words are in the sentence “a rose is a rose”?

Here there is only one correct answer. There are three words in the sentence: “a”, “rose”, and “is”. We can, however, say something else: two of these words occur twice in the sentence. This is not to say that there are two tokens of “a” and of “rose” in the sentence. That would be a mistake: the sentence is an abstract type, and tokens are concrete particulars. Instead of distinguishing between different tokens of “a” and of “rose” in the sentence, we distinguish between the different occurrences of “a” and of “rose” in the sentence. So there are three words in the sentence, but there are five occurrences of words.
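
For readers who find a concrete model helpful, the counting can be mimicked in a few lines of Python. This is only an illustrative sketch: the string stands in for the sentence type, splitting it yields one entry per occurrence of a word, and the set collects the word types; tokens, being concrete inscriptions or sound patterns, are not represented at all.

    sentence = "a rose is a rose"    # stands in for the sentence type

    occurrences = sentence.split()   # one entry per occurrence of a word
    word_types = set(occurrences)    # one entry per word (type)

    print(len(occurrences))          # 5 occurrences of words
    print(len(word_types))           # 3 words: "a", "rose", "is"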

Occurrences, like types, but unlike tokens, are abstract. An occurrence of a word or phrase e within a larger phrase E may be thought of as a state of affairs: the state of affairs of e being located at a particular place in the structure of E. Thus, the two occurrences of “rose” in “a rose is a rose” are distinguished from each other according to where in the structure of “a rose is a rose” the word “rose” is located.

Despite the importance of distinguishing between occurrences and tokens, there are systematic relations between them. It is precisely because the sentence “a rose is a rose” contains two occurrences of “rose” that any token of the sentence will contain two tokens of “rose”. This relation will be important when we turn to theories of true demonstratives.

2. Precursors to Kaplan’s Theory

In the 20th century, there were two basic approaches to the semantics of indexicals and demonstratives: utterance-based and expression-based theories. Almost all of the theories prior to David Kaplan’s influential theory were utterance-based. In early attempts to elaborate such theories, however, philosophers did not always pay due attention to the distinction above between utterances and the tokens produced (or used) in those utterances. The discussion below largely follows the original philosophers’ terminology, departing from it only where it is important to point out that they have elided the distinction between tokens and utterances.

a. Peirce on Indexical Signs

The term “indexical” is due originally to Charles Sanders Peirce, who introduced it as part of a threefold theory of signs. In this theory, Peirce distinguished between icons, indices, and symbols. All signs, on Peirce’s view, have the basic function of representing some object to some cognitive agent, but different kinds of signs accomplish this function in different ways. Icons represent an object to an agent by exhibiting or displaying to the agent the properties of the object they represent. A clear example of this is a diagram of a machine, which represents visually both the shapes of the parts and the structure of the machine.

Indices represent by standing in some kind of intimate relation to their objects. Peirce calls these relations “existential relations”, because indices cannot represent objects unless those objects exist to stand in the appropriate relations to them. Indices are a fundamental part of Peirce’s theory, but for Peirce, existential relations are easy to come by. This is because many causal relations count, for Peirce, as existential relations. As an example of an index, Peirce considers a hole in a wall: one can infer from the hole the existence of a gunshot in the room. Thus, the hole is an index of the gunshot.

As this example makes clear, indices in Peirce’s theory by themselves have little to do with language, or indeed with representation in any obvious sense. Indices exhibit what H. P. Grice would later call natural meaning, wherein the presence of one state of affairs is a reliable indicator of the presence of another. Grice’s famous examples include that smoke means fire, and that the presence of a certain rash means measles. Yet neither of these cases is plausibly an example of representation: the presence of smoke does not represent the presence of fire, nor does the presence of a particular rash represent the presence of measles.

Symbols, finally, represent their objects in virtue of conventions or rules that state that they stand in for those objects. Thus on Peirce’s view, all words of a language are symbols, because all words have their meanings conventionally. But some words are also indices. Peirce cites the demonstrative pronouns “this” and “that” as examples. On Peirce’s view, the conventional rules governing “this” and “that” dictate that a speaker can use them to refer to objects in the immediate perceptual environment. The audience of a successful use of a demonstrative can infer the existence of an object referred to—an “existential” relation. If the audience cannot infer the existence of an object referred to, then the use of a demonstrative has not been successful. Thus, demonstrative pronouns are both symbols (governed by conventional rules) and indices (representing objects in virtue of the existential relations they bear to those objects).

b. Russell on Egocentric Particulars

Bertrand Russell calls words like “I”, “here”, “now”, and so forth egocentric particulars. In Russell’s theory, all such expressions can be analyzed as descriptions involving the demonstrative pronoun “this”. So, for Russell, “now” means “the time of this” and “here” means “the place of this”. Russell offers different analyses of “I”, proposing at one time that it means “the person experiencing this”, and at another time that it means “the biography to which this belongs”. Thus on Russell’s analysis, all egocentric particulars can be reduced to one, and the status of egocentric particulars turns on the status of “this” (about which Russell held conflicting views at different times). According to Russell, this analysis of egocentric particulars captures an important feature of their use: that the reference (or denotation) of a particular utterance of an indexical is always relative to the speaker (and perhaps the time) of the utterance.

Yet Russell’s analysis fails on precisely the grounds that the interpretation of a particular utterance of “this” is not fixed merely by the identity of the speaker and the time of the utterance. This is because, as we see later, speakers can use “this” to refer to different things in their immediate environment. What a speaker refers to using “this” depends on some further feature of the context of the use: either the speaker makes some gesture, or there is enough common knowledge in the background that the speaker’s audience can identify what object the speaker intends to refer to (see section 4b below).

c. Reichenbach on Token-Reflexives

One of the most developed and influential theories of indexicals prior to Kaplan is due to Hans Reichenbach. Reichenbach’s theory is, in many ways, similar to Russell’s, but Reichenbach offers both a more sophisticated analysis of individual indexical expressions, and a more subtle treatment of the principles underlying the analysis. The key to both of these is Reichenbach’s emphasis on tokens in his analysis.

Reichenbach calls indexical expressions “token-reflexives”. The reason for this is clear on even an informal statement of Reichenbach’s view: the indexical “I” means “the person who utters this token”, “here” means “the place at which this token is uttered”, “now” means “the time at which this token is uttered”, and so forth. Token-reflexive expressions are thus expressions whose meaning is in some way keyed to individual tokens of them. (Though Reichenbach’s official theory is stated in terms of types and tokens, some passages in Reichenbach’s Elements of Symbolic Logic suggest that he was thinking of utterances rather than tokens. Contemporary defenders of Reichenbach-inspired views adopt this variation—see section 7a and García-Carpintero.)

Even on this informal statement, Reichenbach’s view clarifies to some degree the role of “this” in Russell’s analysis of egocentric particulars: a particular utterance of an indexical must refer to a token. Yet without further elaboration, this statement of Reichenbach’s view would be subject to the same problem as Russell’s, because it is undetermined which token is supposed to be referred to. If I utter “I am the person who uttered this token”, while pointing at a token of a sentence that someone else wrote on a chalkboard, then I have said something false.

This worry is allayed by a closer examination of the details of Reichenbach’s analysis. Suppose that Bertrand Russell utters (2):

 (2) I am a philosopher.

In so doing, Russell has produced a token of “I”. Call this token t1. Then on a more careful statement of Reichenbach’s view, Russell’s utterance of (2) means the same thing as (3):

 (3) The person who utters t1 is a philosopher.

Since Russell is the person who utters t1, and Russell is a philosopher, Russell’s utterance is true. This shows that our rough translation of “I” above as “the person who utters this token” was incomplete. It is more correct (though on Reichenbach’s view, still not strictly correct—see below) to say that the meaning of “I” is such that any token t of “I” refers to the person who utters t, and so involves a reference to t itself. Thus unlike Russell, who reduced all indexicals—Russell’s egocentric particulars—to the demonstrative pronoun “this”, Reichenbach reduces all indexicals—Reichenbach’s token-reflexives—to a very special kind of token-reflexive operation.
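
A small computational sketch may make the token-reflexive idea vivid. The following Python fragment is merely an illustration, not Reichenbach’s own apparatus; the class and function names are invented for the example. Each token records facts about itself (who produced it, where, and when), and an indexical token is interpreted by appeal to those very facts.

    from dataclasses import dataclass

    @dataclass
    class Token:
        """A concrete token of an expression type."""
        expression_type: str   # for example, "I", "here", or "now"
        producer: str          # who uttered or wrote this token
        place: str
        time: str

    def interpret(token):
        """Interpret a token-reflexive expression via facts about that very token."""
        if token.expression_type == "I":
            return token.producer   # "the person who utters this token"
        if token.expression_type == "here":
            return token.place      # "the place at which this token is uttered"
        if token.expression_type == "now":
            return token.time       # "the time at which this token is uttered"
        raise ValueError("not a token-reflexive expression in this sketch")

    # Two distinct tokens of "I" receive distinct interpretations.
    t1 = Token("I", producer="Russell", place="Cambridge", time="noon")
    t2 = Token("I", producer="Reichenbach", place="Los Angeles", time="midnight")
    print(interpret(t1))   # Russell
    print(interpret(t2))   # Reichenbach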

The token-reflexive operation that forms the basis of Reichenbach’s analysis is the special technical device of “token-quotes”—a pair of arrow marks that Reichenbach introduces in his analysis of the phrase “this token”. For Reichenbach, the result of enclosing a token of “a” in token-quotes is a token that refers to the very token of “a” enclosed in the quotes. The emphasis on “token” here is important: a second token-quote construction, enclosing a second token of “a”, refers to a different token of “a”, namely the one that it encloses.

Call these “token-quote phrases”. These examples show that on Reichenbach’s view, no two tokens of a token-quote phrase can refer to the same thing. As a result, we cannot talk about the meaning of a token-quote phrase, because there is no meaning that any two tokens of the phrase share. For this reason, Reichenbach calls token-quote phrases “pseudo-phrases”. Since token-quote phrases are the foundation of Reichenbach’s analysis of indexicals, all indexicals are similarly pseudo-phrases. As a result, it is strictly speaking incorrect, on Reichenbach’s view, to talk about the meaning of an indexical.

One consequence of this view is that different utterances of (2), even by the same person, will strictly speaking mean different things. Suppose that Russell utters (2) a second time. In doing so, Russell has produced a separate token of “I”. Call this token t2. On Reichenbach’s view, Russell’s second utterance of (2) means the same thing as (4):

(4) The person who utters t2 is a philosopher.

This consequence of Reichenbach’s view runs counter to our intuitions about the use of (2): if Russell uses (2) twice, Russell has said the same thing about himself. On Reichenbach’s view, Russell said two different things about two different tokens of “I”. Yet because in both cases it was Russell who did the uttering, the truth of what Russell said in each case turns on whether Russell is a philosopher. Thus, Reichenbach’s analysis gets the right truth conditions for an utterance of (2), but at the expense of certain intuitions about the meaning of “I”.

Reichenbach’s view has a further odd consequence, noted by David Kaplan. Suppose that I utter (5), and let “t3” name the token of “I” that I have produced in so doing:

(5) If no one were to utter t3, then I would not exist.

According to Reichenbach’s analysis, my utterance of (5) means the same thing as (6):

 (6) If no one were to utter t3, then the person who utters t3 would not exist.

But (6) is plausibly a logical truth. Thus on Reichenbach’s view, my utterance of (5) is true as a matter of logic. Yet my utterance of (5) is clearly false: had I not uttered (5), I would nonetheless have continued to exist.

d. Burks on Indexical Symbols

In the article “Icon, Index, and Symbol”, Arthur Burks develops Peirce’s suggestive remarks about indexical words into a more systematic theory of their meanings. Burks’s theory also addresses some of the odd consequences of Reichenbach’s theory noted above (though it is unclear whether Burks was familiar with Reichenbach’s view). Thus, Burks’s theory represents a culmination of several different strands of thought concerning indexicals prior to Kaplan’s work.

On Burks’s theory, all expression types of a given language have what Burks calls symbolic meaning. This is the meaning of the expression type determined by the conventions governing the language. All tokens of a given expression type share the symbolic meaning of the type. The difference between indexical expressions and non-indexical expressions is in the meanings of individual tokens. For non-indexical expressions, the meaning of an individual token just is the symbolic meaning of the type of which it is a token. For indexical expressions, in contrast, the symbolic meaning of the expression type is only part of the meaning of each individual token of that type. The full meaning of a token of an indexical expression includes information about the token itself—where and when it exists, who produced it, and so forth. Burks calls this full meaning of a token of an indexical expression the indexical meaning of the token. So different tokens of an indexical expression differ in indexical meaning, but their different indexical meanings all have the symbolic meaning of the indexical expression in common.

For Burks, the indexical meaning of a token is what someone must know about that token in order to determine what that token represents. On Burks’s view, the indexical meaning of a token of an indexical expression comprises all of the following:

 (i) the spatiotemporal location of the token;

(ii) a description of the object that the token represents; and

(iii) a set of what Burks calls “directions” that relate the token to the object it represents.

The directions in (iii) can arise in two different ways, either (a) as encoded in the symbolic meaning of the type of which the token is an instance, or (b) as determined by an act of pointing, or some similar gesture on the part of the person who produces or uses the token. Elements (ii) and (iiia) of the indexical meaning of a token are supplied by the symbolic meaning of the type of which the token is an instance. These will be shared by all tokens of the same type of indexical expression. Elements (i) and (iiib) are supplied by an individual’s knowledge of the token and its production or use. These will vary from one token to another.

Though Burks does not examine the question in detail, it appears that the importance of the individual elements of (i)-(iii) can vary from one indexical to another. For example, in the case of an utterance of the indexical “I”, someone may fully understand the utterance without knowing the spatiotemporal location of the utterance. (Suppose, for example, you get a phone call from a friend, but you have no idea where your friend is calling from, or that you hear a call of “Help me!” from a voice you recognize, but you cannot tell where the call is coming from.) On Burks’s view, then, it follows that one can understand an utterance of “I” without fully grasping its indexical meaning.

Burks’s suggestion that a complete semantic theory of indexical expressions may require appeal to two distinct kinds of meaning is important. As we see later, David Kaplan’s influential theory of indexicals develops a related suggestion in a systematic way.

e. Objections to Utterance-based Theories

The theories of Reichenbach and Burks (and probably Russell as well) are clear cases of what the introduction to this section called utterance-based semantic theories of indexicals. There are two influential objections to utterance-based theories. The presentation of the objections will focus on Reichenbach’s theory, because the technical details of Reichenbach’s theory are worked out to a sufficient degree that the force of the objections is easiest to see.

One important objection to utterance-based theories generally is due to David Kaplan. According to Kaplan, utterance-based theories do not provide adequate resources to explain the logical properties of indexicals and demonstratives. According to Kaplan, an adequate semantics for indexicals should explain the logical truth of a sentence like (7):

(7) If today is Monday, then today is Monday.

Yet given an utterance-based semantics, it is unclear how to do so. On Reichenbach’s analysis of indexicals, let u be some utterance of (7), and let t1 and t2 be the two tokens of “today” produced (or used) in u. According to Reichenbach, the truth conditions for u are given by (8):

(8) If the day on which t1 is produced is Monday, then the day on which t2 is produced is Monday.

Not only is (8) not logically true, it could even be false. Suppose that u were performed right around midnight, slowly enough that t1 was produced at 11:59 PM on Monday, and t2 at 12:01 AM on Tuesday. In this case, (8) is false. The same problem arises for the argument

(9) Today is Monday; therefore, today is Monday.

This looks like it should be a valid argument—it appears to have the form p; therefore p. Yet there are utterances of it on which the utterance of the premise is true, while the utterance of the conclusion is false.

A separate problem for utterance-based theories is that a semantic theory for a language should provide an interpretation of every sentence of the language. Yet on utterance-based theories such as Reichenbach’s, sentences containing indexicals receive an interpretation only upon being uttered. In the absence of an utterance of a sentence, Reichenbach’s theory offers no interpretation of it. Given the recursive structure of language, there are sentences that are too long to be uttered by any individual, and hence sentences that never receive any interpretation on Reichenbach’s theory. (For a discussion of and response to both of these objections to utterance-based theories of indexicals, see García-Carpintero.)

3. Kaplan’s Semantic Theory of Indexicals

We now turn to Kaplan’s influential theory of indexicals. Unlike the theories introduced in the previous section, Kaplan’s is an expression-based semantic theory. Kaplan does not take the objects of semantic evaluation to be utterances or tokens. Rather, Kaplan considers the expressions (types) themselves relative to contexts. On Kaplan’s theory, contexts are abstract formal structures that represent certain features of an utterance. As a result, the objects of semantic evaluation on Kaplan’s theory are abstract objects—expressions relative to contexts—rather than concrete physical objects (tokens) or particular events (utterances).

When discussing Kaplan’s theory, one must be careful: there are two different theories attributed to Kaplan on the basis of what he says in “Demonstratives”. We begin by introducing one of these theories. In section 5, when we discuss Kaplan’s logic of demonstratives, we introduce the other theory, and give reasons to prefer the first theory. It is this first theory that we refer to as “Kaplan’s (semantic) theory”.

a. Background and a Basic Insight

Kaplan’s semantic theory of indexicals is embedded in a general picture of the nature of meaning. In order to understand the significance of Kaplan’s theory, it is important to grasp this picture. According to this picture, the meaning of a sentence S—in the sense of the information encoded by S—is a complex, structured entity whose constituents are the meanings of the sub-sentential expressions (words and phrases) that occur in S, and whose structure is determined by the structure of S. This structured entity is called the proposition expressed by S. It is common to represent structured propositions using ordered n-tuples. For example, the sentence “Tally is a dog” expresses the proposition that we can represent using the ordered pair below:

〈BEING A DOG, Tally〉

(It is convenient to talk about ordered pairs, or more generally ordered n-tuples, like this one as being the proposition expressed by “Tally is a dog”, and we will follow this practice. It is important to keep in mind, however, that this is merely a convenience: strictly speaking, a structured proposition is not an n-tuple, and the n-tuple merely represents or stands for the proposition.) The constituents of this proposition are Tally and the property of being a dog. These are the meanings of the significant constituents of the sentence “Tally is a dog”: Tally is the meaning (or referent) of “Tally”, and the property of being a dog is the meaning of the predicate “is a dog”. The structure of the proposition reflects the fact that the sentence “Tally is a dog” is the result of putting the name “Tally” together with the predicate “is a dog”. The sentence “Lassie is a dog” would express a different proposition:

〈BEING A DOG, Lassie〉

It is common to refer to these propositions using the complex “that”-clauses “that Tally is a dog” and “that Lassie is a dog”, respectively. (The “that” in these clauses is not a demonstrative pronoun; it is what linguists call a “complementizer”.)

This picture of propositions as complex structured entities that contain objects and properties as constituents is due originally to Bertrand Russell, from his Principles of Mathematics, and it is currently a subject of significant controversy in the philosophy of language. Kaplan’s semantic theory of indexicals is one of the primary reasons many philosophers today embrace this Russellian picture of propositions.
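
Since structured propositions are being represented by ordered n-tuples, the representation can be mirrored directly in a few lines of Python. This is only the notational convenience stressed above, not a claim about what propositions are; the strings are stand-ins for the property of being a dog and for the individuals Tally and Lassie.

    # An atomic Russellian proposition, represented as an ordered pair:
    # (property, individual).
    BEING_A_DOG = "the property of being a dog"

    that_tally_is_a_dog = (BEING_A_DOG, "Tally")
    that_lassie_is_a_dog = (BEING_A_DOG, "Lassie")

    # Different constituents, hence different propositions:
    print(that_tally_is_a_dog == that_lassie_is_a_dog)   # False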

Kaplan’s main contributions to the semantics of indexicals are (i) the recognition of a distinct kind of meaning, clearest in the case of indexicals like “I”, and (ii) a formal theory that explains how the different kinds of meaning are related to each other and to logic, linguistic competence, and language use. To understand Kaplan’s basic insight, consider two utterances of (10), one utterance by Barack Obama, and the other by Hilary Clinton:

(10) I am flying.

Two observations are immediate here: (i) Obama and Clinton have said or asserted different things—Obama has said of himself that he is flying, while Clinton has said of herself that she is flying—and (ii) the sentence that both Obama and Clinton have uttered means the same thing in both cases. Furthermore, these two observations are related: it is because (10) means what it does, and means the same thing when Obama utters it as it does when Clinton utters it, that Obama and Clinton can each use (10) to say different things.

The traditional notion of a proposition, as captured in the Russellian picture of propositions sketched above, applies to what is said or asserted. On this picture, Obama and Clinton have said or asserted different propositions. So the Russellian picture by itself does not offer any account of the meaning of (10) that remains constant across its different uses. This is where Kaplan’s first contribution comes in.

b. Character, Context, and Content

Kaplan calls the meaning of an expression that stays constant across different contexts of use its character. In Kaplan’s theory, character plays two fundamental roles: (i) the character of “I” is what a competent speaker of English knows in virtue of being competent with “I”; and (ii) the character of an expression is a rule or function whose arguments are contexts, and whose value for any context is what Kaplan calls the content of the expression relative to the context.

The character of “I”, for example, is a function whose value, for any context c, is what Kaplan calls the agent (cA) of c (the speaker or writer of the context). The agent cA of a context c is thus the content of “I” relative to c. A language user who is competent with “I” knows this rule, and it is this knowledge, together with information about a context, that allows a language user to figure out who “I” refers to relative to the context.

Generalizing from this example, we arrive at the following theory of meaning: character and content are two different kinds of meaning had by expressions of a language. In virtue of its character, each expression has a content relative to a context. Different kinds of expressions are assigned different kinds of contents relative to contexts. The content of a singular term like “I” relative to a context is an object or individual. The content of an n-place predicate relative to a context is an n-place property or relation. The content of a sentence relative to a context is a structured, Russellian proposition, whose constituents are the contents, relative to the same context, of the atomic expressions (words or phrases) occurring in the sentence.
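
Kaplan’s talk of characters as rules or functions suggests a simple computational picture: a character takes a context as argument and returns a content. The Python sketch below is a toy model of that picture, not Kaplan’s formal system; the Context fields correspond to the agent, place, time, and world parameters of contexts discussed below, and the strings are stand-ins for individuals and worlds.

    from collections import namedtuple

    # A toy Kaplanian context with four parameters (introduced below).
    Context = namedtuple("Context", ["agent", "place", "time", "world"])

    def character_of_I(c):
        # the content of "I" relative to c is the agent of c
        return c.agent

    def character_of_Obama(c):
        # a non-indexical term: the same content relative to every context
        return "Barack Obama"

    c1 = Context(agent="Barack Obama", place="Washington", time="t1", world="@")
    c2 = Context(agent="Hilary Clinton", place="New York", time="t2", world="@")

    print(character_of_I(c1))   # Barack Obama
    print(character_of_I(c2))   # Hilary Clinton
    print(character_of_Obama(c1) == character_of_Obama(c2))   # True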

Some expressions have a character that yields the same content relative to every context. The character of “Barack Obama”, for example, determines the same individual—Barack Obama—relative to every context. Other expressions have a character that yields different contents relative to different contexts. This is the characteristic feature of indexicals, and it is inherited by any expression that contains an indexical. Thus, we may talk not only about the indexicals “I”, “now”, and “here”, but also about indexical phrases and sentences. An example of an indexical sentence is (10) (repeated).

(10) I am flying.

In virtue of the character of (10), the content of (10) relative to a context in which Barack Obama is the agent is the structured proposition

〈FLYING, Barack Obama〉,

yet relative to a context in which Hilary Clinton is the agent, the content of (10) is the structured proposition

〈FLYING, Hilary Clinton〉.

These propositions differ in what is contributed, relative to the different contexts, by the indexical “I”. The content of “I” relative to the first context is Barack Obama; the content of “I” relative to the second context is Hilary Clinton.

In addition to an agent cA to serve as the content of “I”, each context c of Kaplan’s theory includes a time cT to serve as the content of “now”, a location cP to serve as the content of “here”, and a possible world cW to serve as the content of “actually”. Thus, the sentence “I am located here”, relative to a context c, expresses the structured proposition

〈BEING LOCATED AT, 〈cA, cP〉〉.

In this case, BEING LOCATED AT is a two-place relation between objects or individuals and locations, and the proposition predicates this relation of the agent and location of the context c. This captures the clear intuition that a speaker who utters “I am located here” says of himself or herself that he or she is at the location of the utterance. Additional parameters may be added to contexts as needed by different indexicals (see the discussion of true demonstratives in section 4), but Kaplan’s original theory focuses on the four above. Thus for most purposes, each context c of Kaplan’s theory can be identified with the quadruple 〈cA, cP, cT, cW〉.
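
Continuing the toy model (again an illustration, not Kaplan’s formalism), the content of “I am located here” relative to a context can be assembled from the contents of its parts relative to that same context, mirroring the nested pair 〈BEING LOCATED AT, 〈cA, cP〉〉; all names below are invented for the example.

    from collections import namedtuple

    Context = namedtuple("Context", ["agent", "place", "time", "world"])

    BEING_LOCATED_AT = "the relation of being located at"

    def content_of_I(c):
        return c.agent

    def content_of_here(c):
        return c.place

    def content_of_I_am_located_here(c):
        # the structured proposition <BEING LOCATED AT, <agent of c, place of c>>
        return (BEING_LOCATED_AT, (content_of_I(c), content_of_here(c)))

    c = Context(agent="Saul Kripke", place="Princeton", time="t1", world="@")
    print(content_of_I_am_located_here(c))
    # ('the relation of being located at', ('Saul Kripke', 'Princeton'))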

c. Truth Relative to a Context

Kaplan’s theory also provides the resources for defining truth (and falsehood) for sentences relative to contexts. The underlying, natural idea is that if Saul Kripke utters (11), the sentence, as Saul Kripke has used it, is true in virtue of two facts: (i) relative to the context of Kripke’s use (in which Kripke is the agent), (11) expresses the proposition that Saul Kripke is a philosopher, and (ii) Saul Kripke is a philosopher:

(11) I am a philosopher.

In other words, (11), as Saul Kripke has used it, expresses a proposition that is true at the world in which Saul Kripke has used it (in this case, the actual world).

(To say that a proposition p is true (or false) at a possible world w is just to say that p would be true (false) were w actual. For example, let w be a possible world in which Barack Obama lost to Mitt Romney in the November 2012 presidential election. The proposition that in 2014, Barack Obama is president is false at w, because if w were actual, Barack Obama would not be president.)

Kaplan’s definition of truth (falsehood) for a sentence relative to a context develops this natural idea as follows: a sentence S is true (false) relative to a context c if and only if the content of S relative to c (the proposition expressed by S relative to c) is true (false) at the world cW of c. Thus, the sentence “I am Saul Kripke” is true relative to any context in which Saul Kripke is the agent, but false relative to any context in which Saul Kripke is not the agent.

There are two features of Kaplan’s definition of truth relative to a context worthy of further attention. The first is that sentences have truth values relative to contexts and worlds. This observation is more general than the definition of truth relative to a context. Given any context c and world w, we can assign a truth value to a sentence S relative to c and w: it is just the truth value at w of the proposition expressed by S relative to c. Because each context c uniquely determines a world cW (the world of the context), there are two distinct possible world parameters relevant to assigning a truth value to a sentence S—the world cW of the context c relative to which S expresses a proposition, and the world w at which we evaluate the proposition expressed by S relative to c. This is an example of double indexing, which was recognized before Kaplan’s work as necessary for the treatment of indexicals. (For an early and influential discussion of double indexing for the indexical “now”, see Kamp.)

Double indexing applies not only to sentences but to singular terms and predicates as well. Just as a sentence is assigned a truth value relative to a context and a possible world, so a singular term (either a proper name or a definite description) is assigned a denotation relative to a context and a possible world, and an n-place predicate is assigned an extension (a set of n-tuples) relative to a context and possible world.

The second important feature of Kaplan’s definition of truth relative to a context is that the second possible world parameter is the world of the context. Again, if we focus just on the possible world parameter of a context, this means that the world cW of the context c is playing two roles in the definition of truth relative to a context c: in one role, it represents the world at which a sentence is uttered or used, and relative to which the sentence expresses a proposition. In the other role, it represents the (actual or counterfactual) circumstance relative to which we evaluate the proposition expressed. This was implicit already in the intuitive statement above of the underlying idea that Kaplan’s definition seeks to capture: (11), as Saul Kripke has used it in the world in which he has used it, expresses a proposition that is true at the world in which he has used it. The two occurrences in the previous sentence of the phrase “the world in which he has used it” reflect the two roles played by the world cW of the context c in Kaplan’s formal definition of truth relative to c.

One of Kaplan’s most significant philosophical insights was to recognize the difference between these two roles. To help keep these distinct roles clear, Kaplan introduced the phrase circumstance of evaluation to refer to the second role played by the world parameter in the definition of truth for a sentence relative to a context. This allows us to restate Kaplan’s definition as follows: a sentence S is true relative to a context c if and only if the content of S relative to c (the proposition expressed by S relative to c) is true at the circumstance of evaluation cW determined by c.

(The circumstance of evaluation in Kaplan’s formal definition includes the time cT of the context as well, but this (i) raises questions about the metaphysics of propositions that are better addressed elsewhere, and (ii) would make the ensuing discussion more complicated without compensatory benefits.)

Again, this feature of Kaplan’s definition of truth relative to a context generalizes to singular terms and predicates. A singular term t denotes an object o relative to a context c and circumstance of evaluation cW determined by c, if and only if t denotes o relative to c full stop. An n-place predicate Pn has an extension E relative to c if and only if E is the extension of Pn relative to c and the circumstance of evaluation cW determined by c.
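
Both points, the official definition and the double indexing behind it, can be mirrored in the toy model. The sketch below is an invented illustration: possible worlds are crudely represented as dictionaries recording which properties individuals have there, and none of the helper names are Kaplan’s.

    from collections import namedtuple

    Context = namedtuple("Context", ["agent", "place", "time", "world"])

    # Crude possible worlds: which properties each individual has there.
    w_actual = {"Saul Kripke": {"philosopher"}}
    w_other = {"Saul Kripke": {"logician"}}

    def content_of_I_am_a_philosopher(c):
        # the structured proposition <PHILOSOPHER, agent of c>
        return ("philosopher", c.agent)

    def true_at(proposition, world):
        prop, individual = proposition
        return prop in world.get(individual, set())

    def true_at_context_and_world(sentence_content, c, w):
        # double indexing: the context fixes the content, the world evaluates it
        return true_at(sentence_content(c), w)

    def true_at_context(sentence_content, c):
        # Kaplan's definition: evaluate at the world of the context itself
        return true_at_context_and_world(sentence_content, c, c.world)

    c = Context(agent="Saul Kripke", place="Princeton", time="t1", world=w_actual)
    print(true_at_context(content_of_I_am_a_philosopher, c))                     # True
    print(true_at_context_and_world(content_of_I_am_a_philosopher, c, w_other))  # False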

d. Indexicality and Modality

The importance of the distinction between context and circumstance of evaluation is particularly clear when we consider sentences containing both indexicals and modal operators like “necessarily” or “possibly”. On the standard semantic treatment of the modal operators, sentences are true or false only relative to a possible world. A sentence like (12) is true relative to, or at, a world w if and only if there is a possible world w* (accessible from w) such that (13) is true at w*:

(12) Possibly, Barack Obama is president.

(13) Barack Obama is president.

The modal operator “possibly” in (12) serves to shift the possible world parameter of evaluation: the truth of (12) at a world w depends on the truth of (13) at some other world w*. (Strictly, w* could be identical with w, but it need not be.)

When we turn to indexical sentences, however, there are two possible world parameters relative to which such sentences are true or false: the world of the context and the circumstance of evaluation. Which parameter does the modal operator shift?

One way to approach this question is to ask what we mean when we say that a sentence S is true at a possible world w. One thing we could mean by this is that if one were to utter S in w, then one would say something true. On this account, to say that S is true at all possible worlds is to say that no matter what world one was in, if one uttered S in that world, one would say something true. But this is highly implausible. If Robby the Ranger utters (14) in this world, then Robby says something true, because “Yellow-Yellow” refers to a notorious bear that lived in the Adirondacks in the early 2000s:

(14) If Yellow-Yellow exists, then Yellow-Yellow is a bear.

But in another possible world, the name “Yellow-Yellow” might refer to a raccoon. So were Robby to utter (14) in this other possible world, what Robby said would be false. Thus on this view, (14) is not true at every possible world, and hence (15) is false:

(15) Necessarily, if Yellow-Yellow exists, then Yellow-Yellow is a bear.

But most philosophers, persuaded by Kripke, would reject this conclusion: if Yellow-Yellow was a bear, then she was essentially a bear. (See Kripke for a defense of the existence of essential properties.)

There is an alternative interpretation of what we mean when we say that a sentence S is true at a possible world w. On this interpretation, we consider what S says in the actual world (or what someone who uttered S would strictly and literally say), and we evaluate what S says for truth or falsehood at w. More carefully: a sentence S is true at world w if and only if the proposition actually expressed by S is true at w. On this interpretation, then, evaluating “Necessarily, S” requires first determining the proposition actually expressed by S, and then evaluating this proposition at every possible world. This yields the intuitively correct result for the sentence “Necessarily, if Yellow-Yellow exists, then Yellow-Yellow is a bear”. This is true if and only if the proposition actually expressed by “If Yellow-Yellow exists, then Yellow-Yellow is a bear” is true at every possible world. But this proposition is vacuously true at worlds where Yellow-Yellow does not exist, and if Kripke is correct that Yellow-Yellow is essentially a bear, then this proposition is also true at every world where Yellow-Yellow does exist.

As we saw in our discussion of Kaplan’s definition of truth relative to a context, the role of the circumstance of evaluation is to be the world relative to which the proposition expressed by S relative to a context is evaluated. Thus, the intuitive reflections on what we mean when we say that a sentence S is true at a world suggest a clear answer to the question from two paragraphs back: modal operators like “necessarily” and “possibly” shift the circumstance of evaluation, not the world of the context.

This answer garners further support from our intuitions about sentences containing “actually”. Because “actually” is an indexical, its interpretation relative to a context c is determined by the parameters of the context. In the case of “actually”, the relevant parameter is the world cW of the context. Thus, if modal operators shifted the world of the context, then they would shift the interpretation of the modal indexical “actually”, but intuitively they do not. Kaplan’s famous example of this is (16):

(16) It is possible that in Pakistan, in five years, only those who are actually here now are envied.

In this sentence, “actually” is within the scope of “it is possible that”. So if “it is possible that” shifts the world of the context, then the value of “actually” would be shifted. But it is not. Suppose Kaplan utters both (16) and (17):

(17) Only those who are actually here now are envied.

It is clear that in both cases, Kaplan's use of "actually" picks out the same world: the world in which he performs both utterances. Since the modal operator does not shift the value of "actually", it does not shift the world of the context; the only alternative is that the modal operators "possibly" and "necessarily" shift the circumstance of evaluation. More precisely, for any context c and possible world w,

[Necessarily ϕ] is true relative to a context c and world w if and only if, for every possible world w* (accessible from w), ϕ is true relative to c and w*.

One of Kaplan’s central theses about indexicals in English is that there can be no operator that shifts contexts or parameters of contexts in the way that an operator like “necessarily” shifts the circumstance of evaluation. Kaplan calls such operators monsters. The claim that natural language does not include monsters is a matter of debate in current philosophy and linguistics. (For a sophisticated discussion of monsters in linguistics, see Schlenker.)

There is one final observation worth noting before we leave this section: it is important to recognize that “actually” is both an indexical, receiving a value from the context, and a modal operator. As a modal operator, “actually” serves to shift the circumstance of evaluation in the definition of truth relative to a context. But unlike “necessarily” or “possibly”, “actually” always shifts the circumstance of evaluation to the world of the context. Thus on Kaplan’s theory, for any context c and any possible world w, the rule for “actually” is as follows:

[Actually ϕ] is true relative to c and w if and only if ϕ is true relative to c and cW.

One consequence of this rule is that for any context c and sentence S, if S is true relative to c, then so are both “Actually S” and “Necessarily actually S”. (For more discussion of this consequence, see section 5.)
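
The behavior of "necessarily" and "actually" described above can likewise be pictured in a small sketch (again a toy model in Python; the worlds and the extension of "is president" are invented). Each operator shifts only the circumstance-of-evaluation parameter, and "actually" always shifts it back to the world of the context, which is why "Actually S" and "Necessarily actually S" come out true at any context at which S is true.

WORLDS = ["w_actual", "w2", "w3"]
PRESIDENT = {"w_actual": {"Obama"}, "w2": {"Kripke"}, "w3": {"Obama"}}  # toy extension of "is president"

# A sentence meaning is modeled as a function of a context c and a world w.
def i_am_president(c, w):
    return c["agent"] in PRESIDENT[w]

def necessarily(phi):
    # "Necessarily phi" is true at (c, w) iff phi is true at (c, w*) for every w*:
    # the operator shifts the circumstance of evaluation, never the context.
    return lambda c, w: all(phi(c, w_star) for w_star in WORLDS)

def actually(phi):
    # "Actually phi" shifts the circumstance back to the world of the context.
    return lambda c, w: phi(c, c["world"])

def true_at_context(phi, c):
    return phi(c, c["world"])

c = {"agent": "Obama", "world": "w_actual"}
print(true_at_context(i_am_president, c))                         # True
print(true_at_context(actually(i_am_president), c))               # True: "Actually S"
print(true_at_context(necessarily(actually(i_am_president)), c))  # True: "Necessarily actually S"
print(true_at_context(necessarily(i_am_president), c))            # False: S is not true at every circumstance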

e. Some Consequences of Kaplan’s Theory of Indexicals

There are several consequences of Kaplan’s theory, as laid out thus far, worth noting:

Indexical singular terms like “I” are directly referential.

A singular term is directly referential if and only if its semantic content relative to a context—what it contributes to the propositions expressed in that context by the sentences in which it occurs—is just the object or individual to which it refers. Thus, it is an immediate consequence of Kaplan’s theory that “I” is directly referential, since the semantic content of an indexical singular term like “I” relative to a context is just the agent of the context. Relative to a context, “I” directly refers to the agent of the context.

The thesis that there are directly referential singular terms is in stark contrast to the Fregean view of language, according to which the content of an expression is always a sense—a mode of presentation of an object, property, or proposition.

Indexical singular terms are rigid designators.

The concept of a rigid designator was introduced into philosophical and semantic discussions by Saul Kripke. According to Kripke, an expression e rigidly designates an object o if and only if e designates o in every possible world in which o exists, and does not designate anything else in any world in which o does not exist. To apply the concept of rigid designation to indexical singular terms, however, we need a definition of rigid designation relative to a context. The following definition is somewhat technical, but it correctly captures Kripke’s notion within a semantics for indexical expressions:

Rigid Designation Relative to a Context:

An expression e rigidly designates an object o relative to a context c if and only if, for every possible world w, every predicate F, and every object x distinct from o: (i) if o exists at w, then the proposition expressed relative to c by [e is F] is true at w if and only if, in w, o has the property expressed by F relative to c; and (ii) if o does not exist at w, then it is not the case that the proposition expressed relative to c by [e is F] is true at w if and only if, in w, x has the property expressed by F relative to c.

The indexical singular term “I”, for example, is a rigid designator relative to any context c according to this definition. Relative to c, the sentence [I am F] expresses the proposition

〈F-hood, cA〉.

This proposition is true at an arbitrary world w if and only if cA has the property F-hood in w. Thus relative to c, “I” rigidly designates cA.

Note that in the above example, we do not have to specify whether cA exists at w. This shows that directly referential terms are rigid designators in a particularly strong sense. A directly referential term designates the same object in all possible worlds, whether the object exists at that world or not. (Nathan Salmon, in Reference and Essence, calls such terms obstinately rigid designators.) This is because a directly referential expression contributes the object that it designates to the propositions expressed by sentences in which it occurs; the object is a constituent of the proposition. Any such proposition—one that contains an object or individual as a constituent—is called a singular proposition. Speaking loosely, when we evaluate a singular proposition for truth or falsehood at a possible world w, the singular proposition “brings along” with it the objects that are its constituents. Thus, directly referential terms automatically rigidly designate the objects or individuals to which they refer.
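
The point about obstinate rigidity can be illustrated with another toy sketch (the worlds, individuals, and the predicate "is bearded" are invented for the example). A singular proposition is modeled as a pair of a property and an individual, and evaluating it at a world asks only whether that very individual has the property there, without consulting whether the individual exists at that world.

# Toy facts: which individuals exist, and who is bearded, at each world.
EXISTS = {"w1": {"Kaplan", "Kripke"}, "w2": {"Kripke"}}
BEARDED = {"w1": {"Kaplan"}, "w2": set()}

def content_of_I(c):
    # The content of "I" at a context is simply the agent of the context.
    return c["agent"]

def singular_proposition(property_extension, individual):
    # The proposition <F-hood, o>, modeled as a function from worlds to truth values:
    # it "brings along" the individual o to every world of evaluation.
    return lambda w: individual in property_extension[w]

c = {"agent": "Kaplan", "world": "w1"}
prop = singular_proposition(BEARDED, content_of_I(c))  # the proposition <BEARDED, Kaplan>

print(prop("w1"))                 # True: Kaplan is bearded at w1
print(prop("w2"))                 # False at w2...
print("Kaplan" in EXISTS["w2"])   # ...and note that whether Kaplan exists at w2 was never consulted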

For any definite description [the x: Fx] that uniquely designates an object o relative to a context c, the definite description [the x: Actually Fx] rigidly designates o relative to c.

This consequence of Kaplan’s theory is a corollary of the observations about “actually” at the end of the previous section. Relative to any context c and possible world w, [the x: actually Fx] designates the unique object o that “is F” in the world cW of c, if o exists in w. This is because of the effect of “actually”, which shifts the circumstance of evaluation to the world of the context. Thus, if [the x: Fx] designates o relative to c, [the x: actually Fx] designates o, relative to c, in every world w in which o exists, and does not designate anything else in any world w in which o does not exist.

This consequence of Kaplan’s theory is significant for one of the classic debates in the philosophy of language: the debate over the meaning of proper names. Ever since Saul Kripke’s Naming and Necessity, philosophers and linguists have recognized that proper names, such as “David Kaplan”, in natural languages such as English are rigid designators. Kripke and others take this semantic feature of proper names to be a major objection to the analysis, inspired by Frege and Russell, of proper names as definite descriptions (in Fregean terms, a definite description gives the sense of a proper name). Suppose we analyze the name “David Kaplan” as the definite description “the author of the most important work on indexicals and demonstratives in the 20th century”. Then in some possible world in which Wittgenstein wrote the most important work on indexicals and demonstratives in the 20th century, the name “David Kaplan” would designate Wittgenstein. Thus on this proposal, the name “David Kaplan” is not a rigid designator.

Some philosophers, however, have responded by modifying the Frege-Russell view: if proper names are analyzed as definite descriptions that have been rigidified by adding “actually”, then Kripke’s observation that proper names are rigid designators is just what we would expect. Other philosophers in turn have rejected this modification on various grounds. (For discussion, see chapter 2 of Soames, Beyond Rigidity.)
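
The effect of rigidifying a description with "actually" can also be displayed in a toy sketch (the worlds and the predicate are invented; "w_actual" plays the role of the world of the context). The plain description tracks whoever satisfies the predicate at the world of evaluation, while the "actually"-rigidified description always returns whoever satisfies it at the world of the context.

# Toy extension of the predicate F (say, "wrote the best work on indexicals") at each world.
F = {"w_actual": {"Kaplan"}, "w2": {"Wittgenstein"}}

def the_F(c, w):
    # "the x: Fx" denotes, at a world w, whatever uniquely satisfies F at w.
    candidates = F[w]
    return next(iter(candidates)) if len(candidates) == 1 else None

def the_actually_F(c, w):
    # "the x: actually Fx": "actually" sends evaluation back to the world of the
    # context, so the description denotes the same object at every world.
    return the_F(c, c["world"])

c = {"world": "w_actual"}
print(the_F(c, "w2"))           # 'Wittgenstein': the plain description shifts with the world of evaluation
print(the_actually_F(c, "w2"))  # 'Kaplan': the rigidified description does not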

4. True Demonstratives

So far, we have discussed Kaplan’s semantic theory of pure indexicals—those expressions whose content is uniquely determined relative to a context by basic features of the context (like the agent, time, location, and world). As we noted in the introduction, however, there are also context-sensitive expressions for which these basic features of context are not sufficient to uniquely determine a content relative to a context. These are the true demonstratives. The paradigm examples are the singular demonstrative pronouns “this” and “that”. Except toward the end of this section, I will focus exclusively on “that”.

a. Two Challenges Posed by True Demonstratives

There are several challenges in spelling out a formal theory of true demonstratives. Two of the most important are (i) how to account, in the theory, for the role of whatever is required in a context (gestures, intentions, and so forth) to fix the reference of a particular use of a demonstrative, and (ii) that distinct occurrences of the same true demonstrative can differ in content relative to the same context.

These challenges are related: on an intuitive level, it is because true demonstratives require some further supplementation from the context that distinct occurrences of the same demonstrative can refer to different things. If I point first at the Washington Monument, and then at the Capitol Building while I utter (18), I have said that the Washington Monument is taller than the Capitol Building, and I have done so because there is something in the context that fixes the reference of my first use of “that” as the Washington Monument, and something in the context that fixes the reference of my second use of “that” as the Capitol Building:

(18) That is taller than that.

These observations about true demonstratives pose a problem for Kaplan’s theory as we have stated it thus far: if the meaning of a demonstrative is its character, and the character of an expression is a function that returns the same content whenever applied to the same context, then there is no way for distinct occurrences of a true demonstrative to differ in content relative to the same context. Any attempt to accommodate true demonstratives into Kaplan’s theory must address this problem.

b. Reference Fixing for True Demonstratives

In order to address the first of the two challenges above posed by true demonstratives—that of how to incorporate into the formal theory whatever is required to fix the reference of a particular use of a demonstrative—we must first determine what in fact fixes the reference of a use of demonstrative. There are many different theories, but most fall into one of two categories: the reference of a particular use of a demonstrative is fixed (i) by an associated gesture, or (ii) by an associated intention.

In “Demonstratives,” Kaplan defends a theory of the first kind. For Kaplan, a demonstration is the way that an object that has been singled out in some way (often, but not always, by an act of pointing) appears or is represented from a particular perspective. Kaplan calls this theory the Fregean Theory of Demonstrations. On the Fregean theory, demonstrations have three qualities in virtue of which they closely resemble (pure) indexical definite descriptions: (i) a demonstration determines a mode of presentation of an object (so that different demonstrations may be demonstrations of the same object), (ii) a particular demonstration d might have picked out a different object from the object that it in fact picks out, and (iii) a particular demonstration d might pick out no object at all (in the case of an illusion or hallucination, for example). The Fregean Theory of Demonstrations provides a natural account of the example above, in which I point at the Washington Monument and at the Capitol Building. In the example, the Washington Monument is singled out visually by my first pointing gesture as the object that I am referring to with my first use of “that”, and the Capitol is singled out visually by my second pointing gesture as the object that I am referring to with my second use of “that”.

One virtue of the Fregean Theory of Demonstrations is that it provides an account of why certain uses of demonstratives are informative, while others are not. This is illustrated by a famous example due to John Perry (in his influential article “Frege on Demonstratives”): suppose that we can see both the bow and stern of the aircraft carrier USS Enterprise in harbor, but the middle of the ship is hidden behind a tall building. Now suppose that I point first at the bow, and then at the stern, while uttering (19):

(19) That is identical to that.

My utterance is informative. But suppose instead I had pointed twice at the bow while uttering (19). My utterance in this case would not be informative. According to the Fregean Theory of Demonstrations, the demonstrations in my first utterance present the USS Enterprise in two different ways, while the demonstrations in my second utterance present it in one and the same way. It may be informative to be told that the object presented in one way is identical to the object presented in another way, but it is not informative to be told that the object presented in one way is identical to the object presented that same way. (Observations like this provide one way that Kaplan can respond to the criticisms discussed below in section 6a.)

One problem with gesture-based views generally is that there are uses of demonstratives that are not associated with any gestures at all. Upon seeing a bright flash through the window, I might ask my wife, “what was that?” without needing to perform any gesture at all. If I perform no gesture, then on any theory according to which the reference of my use of “that” is fixed by my gesture, my use of “that” in this example will not refer to anything. This is the wrong result: my use of “that” clearly refers to the bright flash.

This problem with gesture-based views suggests that an intention-based view is superior. But it is important in proposing or defending an intention-based view that one specifies precisely which intention one thinks is significant for fixing the reference of a use of a demonstrative. A speaker who uses a demonstrative may have several intentions: to point at a particular object o, to refer to o, to refer to the object at which he or she is pointing, and so forth. There may be cases in which these intentions do not single out the same object. For example, I may intend both (i) to refer to an object o, and (ii) to refer to the object at which I am pointing. But if I am in fact pointing at some object o* distinct from o, then these two intentions will determine distinct objects.

Philosophers who argue about different theories of reference-fixing for demonstratives often use such cases as data: suppose theory A says that the reference of a use of a demonstrative is fixed by the speaker’s intention α, and theory B says that the reference of a use of a demonstrative is fixed by the speaker’s intention β. Suppose further that there is some case in which a speaker uses “that,” and in which the speaker’s intention α uniquely determines an object o1, and the speaker’s intention β uniquely determines an object o2. Finally, suppose that it is clear in the case in question that the speaker has succeeded in referring with her use of “that” to o2. This is evidence in favor of theory B over theory A.

In his later essay “Afterthoughts,” Kaplan rejects the Fregean Theory of Demonstrations in favor of a view according to which the reference of a use of a demonstrative is fixed not by a pointing gesture, but by the intention that directs the pointing gesture. Kaplan calls these directing intentions. Thus while on the later Kaplan’s view, the reference of a use of a demonstrative is fixed by an intention, that intention is still associated in some way with a speaker’s gestures: if one chooses not to perform a gesture, then one has no intention to direct a gesture at any individual. As a result, it is unclear whether this view successfully avoids one of the central problems with gesture-based views.

Other intention-based accounts may avoid this problem. According to Kent Bach, for example, the reference of a speaker’s use of “that” is the object determined by the speaker’s referential intention. On Bach’s view, a referential intention has a special reflexive structure: a speaker intends the audience to identify, and to take themselves to be intended to identify, some object or individual as the object the speaker is referring to by thinking of that object in a particular way. If the speaker performs some kind of pointing gesture, then the speaker may intend for the audience to think of the object in question as the object that the speaker is pointing at. In other cases, however, the speaker may intend for the audience to think of the object in question in other ways. (Two classic papers in the debate over demonstrative reference fixing are Marga Reimer’s “Do Demonstrations have Semantic Significance?” and Kent Bach’s “Intentions and Demonstrations”.)

c. Adding True Demonstratives to Kaplan’s Theory

In “Demonstratives,” Kaplan considers two ways of adding demonstratives to his theory. The first requires adding an artificial word, “dthat”, to the language, via the following rule: if α is a singular term or definite description, then ⌈dthat[α]⌉ is a singular term. Examples from English include “dthat[the current president of the United States]”, “dthat[Saul Kripke]”, and “dthat[the ice cream cone I ate today]”.

The semantics for “dthat” is such that relative to a context c, the content of ⌈dthat[α]⌉ is the object denoted by α relative to c. For example, relative to a context c such that cW is the actual world and cT is noon on January 31, 2013, the content of “dthat[the current president of the United States]” is Barack Obama, because at noon on January 31, 2013 in the actual world, Barack Obama was president. Thus, “dthat”-terms are directly referential, and hence also rigid designators.

“Dthat”-terms exploit the similarity noted above, in our discussion of the Fregean Theory of Demonstrations, between demonstrations and indexical definite descriptions. In this way, this treatment of demonstratives addresses the problem of multiple occurrences of demonstratives by avoiding it altogether. This is because, for Kaplan, an occurrence of a “dthat”-term corresponds to a use of a demonstrative together with a particular type of demonstration, where the singular term α in ⌈dthat[α]⌉ is playing the role of the demonstration. On this treatment of demonstratives, for example, my utterance of (18) (“that is taller than that”), pointing first at the Washington Monument, and then at the Capitol Building, would be represented by (20):

(20)  Dthat[the object that appears thus-and-so from here] is taller than dthat[the object that appears so-and-thus from here].

In this case, the definite descriptions “the object that appears thus-and-so from here” and “the object that appears so-and-thus from here” represent my two demonstrations. Thus rather than having two occurrences of the same word or phrase, we have two different phrases altogether.
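
The semantics of “dthat” just sketched can be modeled in the same toy style (the description, worlds, and extensions below are invented for the example). The content of a “dthat”-term at a context is the object denoted by the embedded description at the world of the context, and it is that object itself that enters the propositions expressed.

# World-relative denotation of a toy description, "the most-photographed landmark".
MOST_PHOTOGRAPHED = {"w1": "the Washington Monument", "w2": "the Capitol Building"}

# Toy extension of "is over a century old" at each world.
OVER_A_CENTURY_OLD = {"w1": {"the Washington Monument", "the Capitol Building"},
                      "w2": {"the Capitol Building"}}

def content_of_dthat(denotations, c):
    # dthat[alpha] is directly referential: its content at c is the object that
    # alpha denotes at the world of c, not a descriptive condition.
    return denotations[c["world"]]

c = {"world": "w1"}
obj = content_of_dthat(MOST_PHOTOGRAPHED, c)
print(obj)  # 'the Washington Monument'

# The proposition expressed at c by "dthat[the most-photographed landmark] is over a
# century old" is a singular proposition about that very object; evaluating it at w2
# does not ask what is most photographed at w2.
proposition = lambda w: obj in OVER_A_CENTURY_OLD[w]
print(proposition("w1"), proposition("w2"))  # True False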

As a tool for investigating the semantics and logic of directly referential expressions, Kaplan’s “dthat” has been very influential. But as a basis for a semantic theory of the English demonstrative pronoun “that”, “dthat” is inadequate. The primary problem with using “dthat” as a model for the English demonstrative pronoun “that” is that we do not judge ourselves to have used two different phrases when we utter “that is taller than that” while pointing at two distinct objects. Yet if the English demonstrative pronoun “that” functioned like Kaplan’s “dthat”, we would have to say that each use of “that” in an utterance of “that is taller than that” is in fact an utterance of a distinct phrase that in some way combines the word “that” with either the pointing gestures performed or some particular intention. This runs counter to our clear intuition that we are using the same word twice to refer to different things. (See Salmon, “Demonstrating and Necessity”.)

The second way that Kaplan considers adding true demonstratives to his theory requires adding an infinite (or sufficiently large) number of distinct subscripted “that”s: “that1”, “that2”, and so forth. Each of these is treated as a distinct word in the language. Then we add to each context c an infinite (or sufficiently long) sequence cD of objects and individuals. Each subscripted “that” is then assigned its own character: for each i, the content of “thati” relative to a context c is the i-th member of the sequence cD. We will call the members of cD the demonstrata of c. For example, let c be the following context:

〈Saul Kripke, Washington DC, August 4th, 2014, @ (the actual world), 〈the Washington Monument, the Capitol Building,…〉〉

In other words, Saul Kripke is the agent cA, Washington DC is the location cP, August 4th, 2014 is the time cT, the actual world is the world cW, and the sequence

〈the Washington Monument, the Capitol Building,…〉

is the sequence cD of demonstrata of c. Relative to c, the sentence “that1 is taller than that2” expresses the structured proposition

〈TALLER-THAN, 〈the Washington Monument, the Capitol Building〉〉.
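
A toy sketch of this second proposal (with an invented sequence of demonstrata and invented heights for evaluating the predicate) shows how each subscripted demonstrative reads its content off the sequence cD of the context.

# Toy heights, for evaluating "is taller than" at the actual circumstance.
HEIGHTS = {"the Washington Monument": 169, "the Capitol Building": 88}

def character_that(i):
    # Character of the distinct word "that_i": a function from contexts to contents
    # that returns the i-th member of the context's sequence of demonstrata.
    return lambda c: c["demonstrata"][i - 1]

c = {
    "agent": "Saul Kripke",
    "world": "@",
    "demonstrata": ["the Washington Monument", "the Capitol Building"],
}

first = character_that(1)(c)   # content of "that_1" at c
second = character_that(2)(c)  # content of "that_2" at c

# The structured proposition expressed at c by "that_1 is taller than that_2":
print(("TALLER-THAN", (first, second)))
print(HEIGHTS[first] > HEIGHTS[second])  # True at the actual circumstance in this toy model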

This second treatment of demonstratives also avoids the problem of multiple occurrences, because in place of two occurrences of one demonstrative “that”, this theory has occurrences of two distinct terms: “that1” and “that2”. As a result, this theory is subject to an objection similar to that raised above to the treatment of demonstratives using “dthat”: it flies in the face of basic intuitions about the language. One apparently basic feature of English is that it contains a demonstrative pronoun “that” which can be used multiple times to refer to distinct objects. This is inconsistent with the claim that instead of a single demonstrative pronoun “that” there are infinitely many distinct subscripted pronouns “that1”, “that2”, and so forth.

d. David Braun’s Context-Shifting Semantics for True Demonstratives

(This section is more technical than the preceding.) An influential alternative to Kaplan’s two approaches to demonstratives is David Braun’s context-shifting theory of demonstratives. According to this theory, formal contexts include sequences of demonstrata (as on the second of Kaplan’s theories considered above), but formal contexts also include what Braun calls a focal demonstratum. The focal demonstratum of a context is simply one member of the sequence of demonstrata. An example of a formal context on Braun’s view would be

〈Saul Kripke, Washington DC, August 4th, 2014, @, the Washington Monument, 〈the Washington Monument, the Capitol Building,…〉〉.

This formal context differs from the example above only in that the Washington Monument occurs twice: once as a member of the sequence of demonstrata, and once as the focal demonstratum.

Braun then proposes that the meaning of a demonstrative “that” has two parts. One of these parts is its character. For Braun, the character of “that” is a function that for any context c returns the focal demonstratum of c. The second part of the meaning of “that” is a function that shifts the context in a systematic way: on Braun’s view, the result of applying this function to a context c whose focal demonstratum is the i-th member of the sequence of demonstrata is a context whose focal demonstratum is the i+1-th member of the sequence of demonstrata.

So on Braun’s view, the demonstrative “that” is associated with two functions, each of which applies to the formal contexts of the semantic theory, but which yield very different outputs. The character of “that”, which we can abbreviate as “chthat”, is a function that when applied to a context returns a particular parameter of that context—the focal demonstratum. The shifting function of “that”, which we can abbreviate as “shthat”, is a function that when applied to a context returns another context. Evaluating an occurrence of “that” relative to a context c thus involves two steps: in the first step, we apply the character chthat of “that” to c, to yield the content of the occurrence; in the second step, we apply the shifting function to c, to yield a new context shthat(c). The next occurrence of “that” (if there is one) is then evaluated relative to the new context shthat(c).

Thus on Braun’s view, the proposition expressed by a sentence like (18) (reproduced below) relative to a context c is the proposition that follows it:

(18) That is taller than that.

〈TALLER-THAN, 〈chthat(c), chthat(shthat(c))〉〉

On this proposal, the content of the first occurrence of “that” in (18) relative to c is just chthat(c)—the result of applying the character of “that” to c. The evaluation of the first occurrence of “that” in (18) then triggers the application of the shifting function. Thus the content of the second occurrence of “that” in (18) relative to c is chthat(shthat(c))—the result of applying the character of “that” to the context that results from applying the shifting function of “that” to c. The difference between a context c and shthat(c) is just a difference in the focal demonstratum. But the character of “that” (chthat) is a function that maps a context c to the focal demonstratum of c. Thus on Braun’s view, chthat(c) is the focal demonstratum of c, and chthat(shthat(c)) is the focal demonstratum of shthat(c). Thus on Braun’s view, the content of (18) relative to c is the proposition that predicates the relation TALLER-THAN of the focal demonstratum of c and the focal demonstratum of the result of applying the shifting function to c (in that order).

An example will help to clarify the significance of Braun’s view. Suppose that c is the context

〈Saul Kripke, Washington DC, August 4th, 2014, @, the Washington Monument, 〈the Washington Monument, the Capitol Building,…〉〉,

where the Washington Monument is the focal demonstratum. Then shthat(c) is the context

〈Saul Kripke, Washington DC, August 4th, 2014, @, the Capitol Building, 〈the Washington Monument, the Capitol Building,…〉〉,

where the Capitol Building is the focal demonstratum. Now the proposition expressed by (18) relative to c is

〈TALLER-THAN, 〈the Washington Monument, the Capitol Building〉〉.

But (i) this is just the result that we want, and (ii) we have achieved this result without abandoning the idea that the meaning of an indexical or demonstrative is its character—a function from contexts to contents.
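
Braun’s proposal can be rendered in the same toy style (contexts, demonstrata, and the representation of the focal demonstratum as an index into the sequence are all simplifications for illustration). The character ch_that reads the focal demonstratum off the current context, the shifting function sh_that returns a context whose focal demonstratum is the next member of the sequence, and successive occurrences of “that” are evaluated left to right against the successively shifted contexts.

def ch_that(c):
    # Character of "that": returns the focal demonstratum of the context.
    return c["demonstrata"][c["focus"]]

def sh_that(c):
    # Shifting function of "that": a context just like c, except that the focal
    # demonstratum is the next member of the sequence of demonstrata.
    shifted = dict(c)
    shifted["focus"] = c["focus"] + 1
    return shifted

def contents_of_that_occurrences(c, n):
    # Contents of n successive occurrences of "that", evaluated left to right.
    contents = []
    for _ in range(n):
        contents.append(ch_that(c))
        c = sh_that(c)
    return contents

c = {
    "agent": "Saul Kripke",
    "world": "@",
    "focus": 0,  # the focal demonstratum, represented here as an index into the sequence
    "demonstrata": ["the Washington Monument", "the Capitol Building"],
}

first, second = contents_of_that_occurrences(c, 2)
print(("TALLER-THAN", (first, second)))
# ('TALLER-THAN', ('the Washington Monument', 'the Capitol Building'))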

5. Kaplan’s Logic of Indexicals

In addition to the semantic theory for indexicals and demonstratives discussed above, Kaplan provides an account of the logical properties of indexicals. Kaplan’s logic has been just as influential as his semantic theory. Section 5a sketches the core idea of Kaplan’s logic in an informal way, and discusses two examples of logical truths in Kaplan’s system that have been the focus of some philosophical debate. Section 5b introduces the second semantic theory of indexicals attributed to Kaplan (see the introduction to section 3), and briefly discusses reasons most philosophers prefer the semantic theory introduced in section 3.

a. The Core Idea of Kaplan’s Logic

The core of Kaplan’s logic is the idea that a sentence containing indexicals is logically true if and only if the rules governing the meanings of its indexicals, plus the rules for the logical connectives, ensure that the sentence is true in every possible context, independently of the meanings of the non-logical expressions that occur in the sentence. A simple example is (21).

(21) If I am fond of dogs, then I am fond of dogs.

Since, in every context, the character of “I” will return the same individual in both places in the sentence where it occurs, the antecedent and consequent of (21) will have the same truth value in every context, and thus (21) will be true in every context.

A more interesting example is the sentence

(22) I am president if and only if, actually, I am president.

To see that this sentence is true in every context, let us try to construct a context relative to which it is false. In virtue of the semantics for “if and only if”, this would require some context c such that (23) is false relative to c, while (24) is true relative to c (or vice versa):

(23) I am president.

(24) Actually, I am president.

But reflection on the semantics for “Actually” shows that this cannot occur. If (24) is true relative to c, then (by the definition of truth relative to a context), the content of (24) relative to c is true at the circumstance cW of c. Given the semantics for “Actually” (see section 3d), this is the case if and only if the content of (23) relative to c is true at the circumstance cW of c. Since “Actually” shifts the circumstance of evaluation to the world of the context, it has no effect if the circumstance of evaluation already is the world of the context. But to say that the content of (23) relative to c is true at the circumstance cW of c is just to say that (23) is true relative to c. Thus, (24) is true relative to c if and only if (23) is true relative to c, no matter what context we take c to be.

One reason for interest in this example is that while it is a logical truth, the sentence that results from prefacing it with the modal operator “Necessarily” is not a logical truth:

(25) Necessarily (I am president if and only if, actually, I am president).

Relative to a context c, (25) is true if and only if for every world w, (22) is true relative to c and w (because “Necessarily” shifts the circumstance of evaluation). But (22) is true relative to c and w if and only if (23) and (24) are either both true or both false relative to c and w. Yet (24) is true relative to c and w if and only if (23) is true relative to c and cW (because “Actually” shifts the circumstance of evaluation back to the world of the context). Thus whether (25) is a logical truth turns on the following claim about (23): that its content relative to any context c has the same truth value at every circumstance of evaluation. This claim is false. Let c be a context such that cA is Barack Obama and cW is the actual world, and let w be a world in which Barack Obama never ran for president. Then (23) is true relative to c (the content of (23) relative to c is true at the circumstance of evaluation cW), but (23) is not true relative to c and w (because the content of (23) relative to c is not true at the circumstance of evaluation w).

This result has two related interesting consequences: (i) the rule of necessitation fails in Kaplan’s logic. Necessitation is a rule of inference stating that if ϕ is a theorem of a logical system, then so is “Necessarily ϕ”. Necessitation is a standard rule of inference in modal logic, so its failure in Kaplan’s logic of indexicals is surprising. (ii) There are logical truths in Kaplan’s logic that are not necessarily true. In other words, some logical truths in Kaplan’s logic are contingent.

The significance of the second of these consequences is a matter of debate. Kaplan suggests that examples like (22) are cases of contingent a priori claims: propositions that are knowable a priori but are merely contingent. Yet to argue directly from Kaplan’s example to this conclusion requires the further assumption that logical truths in Kaplan’s logic of indexicals express propositions that are knowable a priori. This is a very important topic in the contemporary philosophy of language. For more discussion, see Soames’ Reference and Description, especially Ch. 4.
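
The reasoning above can also be checked by brute force in a toy model (the worlds, agents, and the extension of “is president” are invented for the example): (22) comes out true relative to every context, while its necessitation (25) does not.

from itertools import product

WORLDS = ["w1", "w2"]
AGENTS = ["Obama", "Kripke"]
PRESIDENT = {"w1": {"Obama"}, "w2": {"Kripke"}}  # toy extension of "is president"

def i_am_president(c, w):
    return c["agent"] in PRESIDENT[w]

def actually(phi):
    return lambda c, w: phi(c, c["world"])   # shift evaluation back to the world of the context

def necessarily(phi):
    return lambda c, w: all(phi(c, w_star) for w_star in WORLDS)  # shift evaluation to every world

def iff(phi, psi):
    return lambda c, w: phi(c, w) == psi(c, w)

def true_at_context(phi, c):
    return phi(c, c["world"])

sentence_22 = iff(i_am_president, actually(i_am_president))
sentence_25 = necessarily(sentence_22)

contexts = [{"agent": a, "world": w} for a, w in product(AGENTS, WORLDS)]
print(all(true_at_context(sentence_22, c) for c in contexts))  # True: (22) is true at every context
print(all(true_at_context(sentence_25, c) for c in contexts))  # False: its necessitation is not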

Another controversial example of a logical truth in Kaplan’s logic is (26):

(26) I am here now.

Kaplan argues that a logic of indexicals should do justice to the intuition that (26) is, in his words, “universally true”. This is another example of a sentence that is not necessarily true, even if it is true. Wherever Saul Kripke is located, if he utters (26), he says something true, but if Kripke were to utter “It is a necessary truth that I am here now”, he would say something false. He could have been somewhere else.

Yet the status of (26) as a logical truth has proven controversial. The most common objection arises from considering various technologies that we use in communication. Early objections to Kaplan’s claim that (26) is a logical truth pointed to the use of sentences like (27) in recording messages for an answering machine:

(27) I am not here now.

Suppose an individual A records a message on an answering machine for their home phone that says “I am not here now. Please leave your name and phone number, and I will return your call as soon as I can”. Another individual B then calls A’s house when A is not at home, and the answering machine plays A’s recorded message. It does not seem as though there is anything false in A’s message. Yet if (26) is a logical truth, then (27) should be a logical falsehood. Thus, Kaplan’s claim that (26) is a logical truth seems to run afoul of everyday facts about language and communication. For an extended discussion of the significance of examples like this, see Predelli, Contexts.

b. Kaplan’s Other Semantic Theory

The model theory for Kaplan’s logic is a development of the model-theoretic semantics for modal logic introduced by Saul Kripke in the early 1960s. The key insight of Kripke’s semantics for modal logic was the introduction of possible worlds as indices relative to which expressions are assigned extensions (truth values in the case of formulas, objects in the case of singular terms, and sets of ordered n-tuples in the case of n-place predicates). In this framework, the intension of an expression is the function whose value for each possible world is the extension of the expression relative to that possible world. The intension of a sentence s, for example, is a function from possible worlds to truth values, where for any world w, the value of the intension of s for the world w is the truth value of s relative to w.

This allows us to assign to each sentence s a set of possible worlds at which s is true. For many philosophers, this set is an obvious and natural candidate for the proposition expressed by s. Extending this idea to Kaplan’s logic, the proposal is that the content of an expression relative to a context is an intension, and hence the proposition expressed by a sentence ϕ relative to a context c (in a model M) is just the set of possible worlds w (and times t, in Kaplan’s formal system), such that

⊨M c f t w ϕ (where f is an assignment of values to the variables).

Extending this idea further to Kaplan’s semantic theory for indexicals and demonstratives in English, the proposal is that the proposition expressed relative to a context in which Barack Obama is the agent by the sentence

(31) I am flying

is just the set of possible worlds at which it is true that Barack Obama is flying. Similarly, the intension of the singular term “I” relative to this context is the function whose value for any possible world w is just Barack Obama.

This alternative semantic theory is suggested by some of Kaplan’s remarks in “Demonstratives,” and it is the theory that emerges from the formal semantics for the language LD spelled out above. The difference between this alternative semantic theory and the semantic theory attributed to Kaplan in section 3 is the subject of a great deal of contemporary controversy. At issue is the nature of propositions and meaning: according to the theory attributed to Kaplan in section 3, the proposition expressed by a sentence relative to a context is a structured, complex entity that includes as constituents the meanings, relative to the same context, of the words occurring in the sentence. According to the alternative semantic theory sketched immediately above, the proposition expressed by a sentence relative to a context has no such structure or constituents. It is a set of possible worlds.

One reason to prefer the theory attributed to Kaplan in section 3 is that it is only on this theory that we can distinguish between singular terms that are directly referential, and singular terms that are rigid but not directly referential. On the alternative semantic theory sketched in this section, the content of a term relative to a context is an intension: a function from possible worlds to objects. The intension of a rigid designator relative to a context is just a constant function—a function that for any possible world returns the same object. The intension of a directly referential expression is just the same thing. Thus there is, on this alternative semantic theory, no difference in content between a directly referential expression that refers to an object o and a rigid, but not directly referential expression that refers to o. Thus on this alternative semantic theory, the following two terms have the same content relative to any context:

(32) 3

and

(33) the natural number x such that x(x-1)(x-2) = x + ((x-1) + (x-2))

Since both (32) and (33) rigidly designate the number three, the intension of each relative to a context is just the function that for any possible world returns the number three. Thus this alternative theory effaces two obvious differences between (32) and (33): one is a difference in structure, and the other is an intuitive difference between the fact that (32) serves merely to tag a particular number while the definite description (33) picks out the number three in virtue of a particular property of that number. Both of these differences are preserved in a semantic theory according to which propositions have structure that reflects the structure of the sentences expressing them.
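
The collapse can be displayed in a toy computation (with an invented, finite stock of worlds, and with the description’s search restricted to numbers greater than one, since in unrestricted form the equation in (33) is also satisfied by 1). As functions from worlds to extensions, the numeral and the description are indistinguishable, even though one merely tags the number and the other picks it out via a property.

WORLDS = ["w1", "w2", "w3"]  # an invented, finite stock of worlds

def intension_of_numeral_three(w):
    # The numeral "3" directly (and so rigidly) designates the number three.
    return 3

def intension_of_description(w):
    # "the natural number x such that x(x-1)(x-2) = x + ((x-1) + (x-2))",
    # searching only among numbers greater than 1 to secure uniqueness in this toy check.
    return next(x for x in range(2, 100) if x * (x - 1) * (x - 2) == x + (x - 1) + (x - 2))

# As functions from worlds to extensions, the two intensions cannot be told apart:
print(all(intension_of_numeral_three(w) == intension_of_description(w) for w in WORLDS))  # True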

6. Objections to Kaplan’s Semantic Theory and Logic

For the reasons immediately above, most philosophers prefer Kaplan’s semantic theory introduced in section 3 to the theory above based more strictly on Kaplan’s logic. The objections to Kaplan’s semantic theory that follow focus on the theory introduced in section 3.

a. Objections to Direct Reference

One of the consequences of Kaplan’s semantic theory noted in section 3e is that indexicals and demonstratives are directly referential: relative to a context, the content of an indexical is just the object that the indexical refers to relative to that context. As a result, for any context c relative to which two indexicals refer to the same thing, the two indexicals have the same content. It follows that two sentences that differ only in so far as one contains one of these indexicals where the other sentence contains the other indexical will express the same proposition relative to any context relative to which the two indexicals have the same content. Suppose, for example, that Saul Kripke sees an individual in a mirror, and gesturing toward the mirror utters (34) and (35):

(34) I am Saul Kripke.

(35) He is not Saul Kripke.

It turns out, however, that Kripke is in fact seeing himself in the mirror, and so has referred to himself with “he”. Relative to this context, the two sentences that Kripke has uttered express the following propositions:

〈IDENTITY, 〈Saul Kripke, Saul Kripke〉〉

〈NEGATION, 〈IDENTITY, 〈Saul Kripke, Saul Kripke〉〉〉

(Where IDENTITY is the relation of being the same thing as, and NEGATION is the property of being false (or not true).) The second of these propositions is just the negation of the first. Thus relative to the context of Kripke’s utterance, (34) and (35) express contradictory propositions. If Kripke’s utterances are indicative of Kripke’s beliefs, then it appears to follow that Kripke believes these two contradictory propositions. Yet it is implausible to think that Kripke, careful logician that he is, would believe two obviously contradictory propositions such as these. (For the beginning of a response, see section 4b.)

A second objection to the thesis that indexicals and demonstratives are directly referential concerns the use of demonstrative pronouns “this” and “that” in complex noun phrases, such as “that dog chewing on a stick” or “this city”. Such phrases are usually called complex demonstratives. The standard Kaplanian view of complex demonstratives is that relative to a context c, (an occurrence of) the complex demonstrative “that F” refers to the object assigned to (the occurrence of) it by c, provided that the object satisfies the predicate F relative to c. Examples like (36) cast doubt on this consequence of Kaplan’s theory:

(36) Every hiker of the John Muir Trail remembers that day they stood on the summit of Mt. Whitney.

Sentence (36) contains a complex demonstrative “that day they stood on the summit of Mt. Whitney”. According to Kaplan’s theory, the content of this complex demonstrative relative to a context should be a particular day. Yet it is clear that the proposition expressed by (36) does not contain any particular day as a constituent. The proposition is not about any one particular day, but is instead about how each hiker remembers a different day. This is because the complex demonstrative contains a pronoun, “they”, which is bound by the quantifier phrase “every hiker of the John Muir Trail”.

A third objection to the thesis that indexicals and demonstratives are directly referential also concerns complex demonstratives. This objection is based on the observation that there are uses of complex demonstratives where there is no intuitive reference at all. An example is an utterance of (37):

(37) That first wolf that allowed itself to be domesticated did pretty well.

The speaker of this utterance clearly has no particular wolf in mind of which it would be correct to say that the speaker is referring to that wolf. Rather, the speaker is making a general claim to the effect that whatever wolf first allowed itself to be domesticated did pretty well. Thus in this case, there is nothing associated with the speaker’s utterance (no gesture or appropriate intention) that could serve to fix the reference of the demonstrative “that first wolf that allowed itself to be domesticated”. A semantics for demonstratives according to which they are directly referential cannot account for this: if there is nothing to fix the reference of the speaker’s use of the demonstrative, then according to Kaplan’s theory, the speaker’s utterance should be defective, inviting from the audience the question “Which wolf are you referring to?” But the speaker’s utterance is not defective. Kaplan’s theory does not have the resources to explain cases like this.

b. Objections to Kaplan’s Treatment of Contexts

A different source of worries about Kaplan’s theory is his treatment of contexts as sequences of parameters. On Kaplan’s view, a context c can be identified with the sequence

〈cA, cP, cT, cW〉,

where cA is what Kaplan calls the agent, cP the location, cT the time, and cW the world of the context, respectively. Indexicals have contents relative to these contexts. This treatment of contexts raises a problem for Kaplan’s semantic theory insofar as one of the basic goals of a semantic theory for a language like English is to determine constraints on what speakers of the language can use words and sentences of the language to strictly and literally say. The problem is how to apply Kaplan’s theory to actual and possible uses of language by speakers. Without some rule or principle that assigns formal contexts of Kaplan’s theory to uses of indexicals by speakers, Kaplan’s theory fails to achieve this basic goal of a semantic theory.

An example of such a principle for assigning contexts to uses is what we can call the naïve view of contexts:

For any utterance u of an indexical or sentence containing an indexical, the semantically relevant or semantically appropriate formal context for u is the sequence 〈cA, cP, cT, cW〉 such that cA is the speaker of u, cP is the location at which u occurs, cT is the time at which u occurs, and cW is the world in which u occurs.

The naïve view yields the correct results for many uses of sentences containing indexicals. If David Kaplan arrives at a party at 8 PM on January 31, 1973, and utters “I am here now!”, what Kaplan has intuitively said is that he is at the party at the time of his utterance. The naïve view would assign to Kaplan’s utterance of this sentence the context

〈David Kaplan, the location of the party, 8 PM on 1/31/1973, w〉.

Relative to this context, the content of “I am here now” is the proposition that David Kaplan is at the location of the party at 8 PM on January 31, 1973. This is just what we take David Kaplan to have said. In this way, principles like the naïve view bridge the gap between the formal contexts of Kaplan’s semantic theory and actual and possible uses of indexicals in communication.
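
Stated as a rule, the naïve view is simply a function from utterances to formal contexts. Here is a minimal sketch (the representation of an utterance as a tuple of speaker, location, time, world, and sentence is an invented convenience, not part of any particular theory):

from collections import namedtuple

Utterance = namedtuple("Utterance", ["speaker", "location", "time", "world", "sentence"])
Context = namedtuple("Context", ["agent", "position", "time", "world"])

def naive_context_for(u):
    # The naive view: the semantically appropriate context for an utterance u is
    # the tuple of u's speaker, location, time, and world.
    return Context(agent=u.speaker, position=u.location, time=u.time, world=u.world)

u = Utterance(speaker="David Kaplan", location="the party", time="8 PM, 1/31/1973",
              world="@", sentence="I am here now!")
print(naive_context_for(u))
# Context(agent='David Kaplan', position='the party', time='8 PM, 1/31/1973', world='@')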

One problem with the naïve view of contexts is that the nature of utterances is left underspecified. An example discussed earlier in the section on logic will help to illustrate the issue: if someone records “I am not here now” on their answering machine, and someone else later calls and hears this recorded message, what event counts as the utterance? Is it the act of recording the message, or the act of calling and triggering the replay of the message? Or the event of the replay of the message itself? Written messages generate similar questions. The issues raised by written notes and recorded messages are currently a topic of much debate in the philosophy of language. (For an extended discussion and references, see Predelli.)

c. Objections to Kaplan’s Logic

One final objection to Kaplan’s theory concerns his logic, in particular his claims about the logical behavior of indexicals and demonstratives. According to Kaplan’s logic, an argument like (38) is valid, because the conclusion is true relative to any context relative to which the premise is true (trivially, since the premise and conclusion are the same):

(38) It is quiet now; therefore, it is quiet now.

Recall (from section 2e) that Kaplan argued against utterance-based theories precisely on the grounds that such theories predicted that arguments like this are invalid, because there are utterances of (38) in which the utterance of the premise is true, but the utterance of the conclusion false (if, for example, it suddenly becomes very noisy halfway through the utterance of (38)).

Yet some philosophers have argued for precisely the opposite conclusion: that examples like this show that Kaplan’s claims about the logic of indexicals are wrong. A valid argument should provide a kind of epistemic assurance that anyone who uses the argument in reasoning will never be led from truth to falsehood. Yet the example above of the use of (38) appears to show that there are cases in which one who uses (38) in reasoning can be led from truth to falsehood. Thus, (38) does not provide the kind of epistemic assurance that a valid argument should provide.

The effect of this objection is potentially quite radical. If it is correct, then many arguments that at first glance appear to be valid are not valid. Exactly how radical the objection is depends in part on how widespread the phenomenon of indexicality is in natural language. Some philosophers, for example, have argued that quantifier phrases like “every sailor” are context-sensitive in a way very much like traditional indexicals like “I” or “now” are context-sensitive. If this is correct, and the objection currently under consideration holds, then the traditional syllogism (39) is invalid:

(39) Every sailor is human; every human is a mammal; therefore, every sailor is a mammal.

This objection to Kaplan’s logic thus has potentially far-reaching consequences. (For an example of a philosopher who embraces these consequences, see Yagisawa.)

7. Alternatives to Kaplan’s Theory of Indexicals

Recent work on the semantics of indexicals has seen a proliferation of alternatives to Kaplan’s theory. These alternatives usually take one of two forms: (i) theories that reject Kaplan’s appeal to contexts (as formal objects) altogether in favor of a token-reflexive (or utterance-reflexive) semantic treatment of indexicals, and (ii) theories that retain Kaplan’s formal apparatus of contexts and character but propose alternative hypotheses about the meanings of particular indexicals or demonstratives. This section presents the most influential current token-reflexive theory, before turning to a very brief sketch of a handful of alternatives that are within the Kaplanian framework (or something very much like it).

a. John Perry’s Reflexive-Referential Theory

The most developed utterance-based semantics for indexicals is currently John Perry’s “reflexive-referential” theory. The distinguishing feature of Perry’s theory is the suggestion that a single utterance u of a sentence like “I am hungry” is associated with several different kinds of content. Chief among these are (i) the referential content of u, and (ii) the reflexive content of u. In this way, Perry seeks to combine the insights of Reichenbach and Burks and the direct reference semantics of Kaplan into one theory of indexicals.

To illustrate the difference between these two kinds of content, let u be an utterance of “I am hungry” by Saul Kripke. Then according to Perry, the referential content of u is the singular proposition that Saul Kripke is hungry. This is the same content that Kaplan’s theory would assign to the sentence relative to a formal context in which Kripke is the agent. But Perry also recognizes a distinct variety of content expressed by the utterance: the reflexive content of u is the proposition that the speaker of u is hungry. This is (roughly) the content that Reichenbach’s theory would assign to u.

Perry’s theory is based on several subtle distinctions. The first is a distinction between different ways in which the environment of an utterance influences the interpretation of that utterance. Perry calls the environment in which an utterance occurs the context of the utterance, and distinguishes between two roles of context, which he calls “pre-semantic” and “semantic”. (It is important to recognize that Perry’s use of “context” is distinct from Kaplan’s—according to Kaplan, a context is a formal object of a semantic theory; according to Perry, a context is a complex situation that includes an utterance.) Pre-semantic uses of context involve using cues from the context of an utterance to resolve ambiguities, such as knowing whether a speaker who utters

I saw her at the bank

is talking about a financial institution or a riverside, or knowing whether a speaker who utters

I saw her duck under the table

is talking about someone’s attempt to dodge something or instead is talking about someone’s choice of pet. Thus, we use the context of an utterance pre-semantically in order to determine the structure and conventional meanings intended by the speaker of the utterance.

Semantic uses of the context of an utterance include the interpretation of any indexicals uttered in the course of the utterance. Here Perry makes two further distinctions, one between different kinds of features of a context, and one between the different ways indexicals exploit these different features. The first of these is a distinction between what Perry calls narrow context and wide context. The narrow context of an utterance comprises constitutive features of the utterance. Perry takes these to be the agent, time, and location of the utterance. Changing any of these results in a different utterance. The wide context of an utterance is, in effect, everything else that might be relevant to the interpretation of the indexicals uttered in the utterance. One of Perry’s examples of a feature of wide context is the length of the space between a speaker’s hands when a speaker utters

It was yea big.

This is a feature of the context of the utterance that could be changed without resulting in a different utterance. Thus, it is not a component of the narrow context of the utterance. Features of wide context are thus optional in a way that features of narrow contexts are not: there are utterances in which a speaker does not indicate any length of space between his or her hands, but there is no utterance that does not take place at a certain time.

The second distinction Perry makes in his account of semantic uses of context in the interpretation of indexicals is between kinds of indexicals. According to Perry, some indexicals are such that the referential content of an utterance of them is fixed automatically in the context of the utterance in virtue of their meanings. These are very much like Kaplan’s “pure indexicals”. The least controversial example of an automatic indexical in Perry’s theory is the first-person pronoun “I”. Another plausible candidate is the modal indexical “actually”, any utterance of which automatically picks out the world in which the utterance occurs.

In contrast to automatic indexicals, Perry argues that some indexicals are such that the referential content of an utterance of them is determined in part by certain intentions with which the speaker of the utterance utters them. The clearest examples of such intentional indexicals are Kaplan’s true demonstratives: the demonstrative pronouns “this”, “that”, “these”, “those”, and “there”. But Perry also notes that the referential contents of certain utterances of “now” and “here” are fixed by the speakers’ intentions. An example is the use of “now” in an utterance of

The summers are warmer now than they were ten years ago.

The crucial observation here is that the referential content of “now” is a certain (perhaps not totally determinate) span of time, and the length of this span is determined by the speaker’s intentions.

Yet the referential content of any utterance of “now” is constrained in such a way that the span of time to which the utterance refers must include the time of the utterance: a feature of the narrow context of the utterance. This illustrates how Perry’s two distinctions, narrow versus wide context and automatic versus intentional indexicals, cut across each other. The result is a fourfold classification of indexicals:

NA (Narrow Context; Automatic Indexical): “I”, “actually”
NI (Narrow Context; Intentional Indexical): “now”, “here”
WA (Wide Context; Automatic Indexical): “tomorrow”, “yea”
WI (Wide Context; Intentional Indexical): “this”, “that dog”, “there”

The referential content of an utterance of an NA indexical is fixed automatically to some feature of the narrow context. The referential content of an utterance of an NI indexical is fixed by the intentions of the speaker of the utterance, but is constrained in some way by some feature of the narrow context. The referential content of an utterance of a WA indexical is fixed automatically to some feature of the wide context of the utterance. Finally, the referential content of an utterance of a WI indexical is fixed by the intentions of the speaker of the utterance, and only constrained, if at all, by whatever features of the wide context are determined by the speaker’s intentions.
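
The classification and the accompanying reference-fixing rules can be summarized in a short sketch. The dictionaries and simplified rules below are assumptions of this illustration, not Perry's formalism; in particular, treating the world of the utterance as a narrow-context feature is a shortcut adopted here only to match the NA classification of “actually”.

    # Illustrative sketch only; the dictionaries and simplified
    # reference-fixing rules are assumptions of this example.
    CLASSIFICATION = {
        "I":        ("narrow", "automatic", "agent"),
        "actually": ("narrow", "automatic", "world"),  # shortcut: world treated as narrow
        "now":      ("narrow", "intentional", "time"),
        "here":     ("narrow", "intentional", "location"),
        "tomorrow": ("wide",   "automatic", "day_after_utterance"),
        "yea":      ("wide",   "automatic", "gap_between_hands"),
        "that":     ("wide",   "intentional", None),
    }

    def referential_content(indexical, narrow, wide, intention=None):
        """Toy reference-fixing rule for each of the four classes."""
        context_kind, mechanism, feature = CLASSIFICATION[indexical]
        if mechanism == "automatic":
            # Fixed automatically by a feature of the narrow or wide context.
            source = narrow if context_kind == "narrow" else wide
            return source[feature]
        if context_kind == "narrow":
            # Intentional but constrained by a narrow feature: the span the
            # speaker intends for "now" must include the time of utterance.
            assert narrow[feature] in intention, "intention must respect the narrow constraint"
            return intention
        # Wide and intentional ("that", "there"): the intention does the work,
        # constrained at most by wide-context features the speaker invokes.
        return intention

    narrow = {"agent": "Mary", "time": "noon", "location": "Pittsburgh", "world": "@"}
    wide = {"day_after_utterance": "June 2", "gap_between_hands": "40 cm"}
    print(referential_content("I", narrow, wide))                      # Mary
    print(referential_content("now", narrow, wide, ["noon", "1 pm"]))  # ['noon', '1 pm']
    print(referential_content("that", narrow, wide, "that dog, Fido")) # that dog, Fido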

The reflexive content of an utterance of an indexical, on Perry’s theory, is roughly the content that encodes what someone who is competent with the indexical has to know about the utterance in order to be in a position to identify its referential content. This is captured in the claim that the reflexive content of an utterance u of “I am hungry” is the descriptive proposition that the speaker of u is hungry. Anyone who overhears u is in a position to understand this reflexive content. But only someone who can identify the speaker of u is in a position to grasp the referential content of u. In this way, Perry attempts to revive Burks’s theory of the indexical meaning of a token (or utterance) as a theory of what a competent language user has to know in order to understand that token.
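
A minimal sketch of the contrast, under assumed function names that are not Perry's: the reflexive content is a descriptive condition on the utterance itself, available to anyone who overhears it, while the referential content is a singular proposition about whoever in fact produced it.

    # Illustrative sketch; the function names are assumptions, not Perry's.
    def reflexive_content(utterance_id, predicate):
        """What any competent hearer grasps just by overhearing the
        utterance: a descriptive condition on the utterance itself."""
        return f"the speaker of {utterance_id} is {predicate}"

    def referential_content_of(utterance_id, speaker_of, predicate):
        """What a hearer grasps once the speaker has been identified:
        a singular proposition about that very person."""
        return (speaker_of[utterance_id], predicate)

    speaker_of = {"u": "Mary"}
    print(reflexive_content("u", "hungry"))                  # the speaker of u is hungry
    print(referential_content_of("u", speaker_of, "hungry")) # ('Mary', 'hungry')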

b. Expression-Based Alternatives

Recent work on the semantics of indexicals and demonstratives has led to a proliferation of alternative proposals. Many of these proposals focus on complex demonstratives (see section 6a). The challenge for theories of complex demonstratives is to accommodate both the examples that support Kaplan’s theory—according to which both simple and complex demonstratives are directly referential—and the examples (some of which were presented in section 6a) in which complex demonstratives behave in ways inconsistent with Kaplan’s theory.

One way to meet the challenge of the range of examples is to propose that complex demonstratives are ambiguous. On this view, it is possible to maintain that the examples of complex demonstratives that support Kaplan’s theory are cases of direct reference, while the other examples are cases in which the complex demonstratives in question have a different semantics. One advantage of such a theory is that it preserves the theoretical elegance and intuitive appeal of Kaplan’s treatment of standard referential uses of complex demonstratives. One disadvantage is that positing an ambiguity is often thought of as a cheap solution to a problem. Thus, any philosopher or linguist who wants to defend an ambiguity theory of this sort has to argue that the ambiguity is well-motivated, and not simply a response to recalcitrant examples. (For further discussion of ambiguity theories, see Georgi, 2012.)

A different way to meet the challenge posed by the range of uses of complex demonstratives is to argue that some set of these uses reveals the true semantic nature of complex demonstratives, and then show how to explain the other uses within the framework of the proposed semantics. Two recent proposals along these lines are due to Jeffrey C. King and to David Braun. According to King, the uses of complex demonstratives discussed in section 6a show that complex demonstratives are not directly referential at all. On King’s view, complex demonstratives are quantifiers, like “every dog”, or “some homemade cookie”. The key semantic feature of quantifiers is that their content, relative to a context, is not an object or individual. Rather, the content of a quantifier relative to a context is itself a structured, complex entity, whose components are the contents of the expressions that occur within the quantifier. King defends an elaborate theory of the quantificational meanings of complex demonstratives, and shows how this theory accommodates a wide range of linguistic data.
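
The difference King is pointing to can be pictured, very roughly, as a difference in what kind of entity the content is. In the sketch below, the representations are assumptions of this illustration rather than King's or Kaplan's official semantics: a directly referential content is just the demonstrated individual, whereas a quantificational content is a structured complex built from the contents of the expressions occurring within the demonstrative phrase.

    # Illustrative contrast only; these representations are assumptions
    # of this sketch, not King's or Kaplan's official semantics.

    # Kaplan-style direct reference: the content of "that dog", relative
    # to a context, is just the demonstrated individual itself.
    def direct_reference_content(context):
        return context["demonstratum"]

    # King-style quantificational content: a structured complex whose
    # parts include the contents of the expressions occurring in the
    # demonstrative, analogous to the structured content of "every dog".
    def quantificational_content(determiner, nominal_property, context):
        return (determiner, nominal_property, context.get("speaker_intention"))

    context = {"demonstratum": "Fido", "speaker_intention": "the dog being pointed at"}
    print(direct_reference_content(context))                 # Fido
    print(quantificational_content("that", "dog", context))
    # ('that', 'dog', 'the dog being pointed at')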

In contrast to King’s theory, David Braun defends a traditional Kaplanian treatment of complex demonstratives as directly referential. On Braun’s view, the uses of complex demonstratives in section 6a that appear to conflict with Kaplan’s theory can be explained on pragmatic grounds: they are cases in which what a speaker means goes above and beyond what the speaker strictly and literally says. This allows Braun to maintain Kaplan’s theory, and to explain the apparently conflicting data, while rejecting the claim that complex demonstratives are ambiguous.

8. References and Further Reading

  • Bach, Kent. 1992. “Intentions and Demonstrations.” Analysis, 52(3): 140–146.
    • Bach defends the view that demonstrative reference is fixed by the speaker’s referential intentions.
  • Braun, David. 1996. “Demonstratives and Their Linguistic Meanings.” Noûs, 30(2): 145–173.
    • This paper is the source of the influential context-shifting semantic theory of demonstratives.
  • Braun, David. 2008. “Complex Demonstratives and Their Singular Contents.” Linguistics and Philosophy, 31: 57–99.
    • Braun defends a direct reference semantics for complex demonstratives from the objections raised by Jeff King and others.
  • Burks, Arthur W. 1949. “Icon, Index, and Symbol.” Philosophy and Phenomenological Research, 9(4): 673–689.
    • Burks provides both an insightful discussion of Peirce’s original remarks on indexicals and a sophisticated theory of their meaning.
  • García-Carpintero, Manuel. 1998. “Indexicals as Token-Reflexives.” Mind, 107: 529–563.
    • García-Carpintero presents a careful analysis of token-reflexive views of indexicals and defends them from several influential objections.
  • Georgi, Geoff. 2012. “Reference and Ambiguity in Complex Demonstratives.” In William P. Kabasenche, Michael O’Rourke, and Matthew H. Slater (eds), Reference and Referring: Topics in Contemporary Philosophy, v.10. Cambridge, MA: MIT Press, pp. 357–384.
    • Georgi defends a view of complex demonstratives according to which they are ambiguous between referential and non-referential readings.
  • Kamp, Hans. 1971. “Formal Properties of ‘Now.’” Theoria, 37: 237–273.
    • Kamp’s paper is an early and influential discussion of double-indexing, or two-dimensional semantics, as applied to natural languages.
  • Kaplan, David. 1989a. “Demonstratives.” In Joseph Almog, John Perry, and Howard Wettstein (eds), Themes from Kaplan. New York: Oxford University Press, pp. 481–563.
    • Kaplan’s most influential work on demonstratives. Its subtitle says it all: “an essay on the semantics, logic, metaphysics, and epistemology of demonstratives and other indexicals.”
  • Kaplan, David. 1989b. “Afterthoughts.” In Joseph Almog, John Perry, and Howard Wettstein (eds), Themes from Kaplan. New York: Oxford University Press, pp. 565–614.
    • Kaplan provides further reflection on some of the main themes of “Demonstratives.”
  • King, Jeffrey C. 2001. Complex Demonstratives. Cambridge, MA: MIT Press.
    • King presents several powerful criticisms of Kaplan’s direct reference semantics for complex demonstratives, and defends an alternative semantic theory according to which complex demonstratives are context-sensitive quantifiers.
  • Kripke, Saul. 1980. Naming and Necessity. Cambridge, MA: Harvard University Press.
    • Kripke argues against descriptivist theories of the meaning and reference of proper names and natural kind terms, along the way introducing the definition of rigid designation.
  • Perry, John. 1977. “Frege on Demonstratives.” The Philosophical Review, 86(4): 474–497.
    • Perry argues that indexicals and demonstratives pose a puzzle for the Fregean theory of meaning as sense.
  • Perry, John. 1979. “The Problem of the Essential Indexical.” Noûs, 13(1): 3–21.
    • Perry offers several influential examples in support of the view that indexicals play a privileged role in epistemology.
  • Perry, John. 2001. Reference and Reflexivity. Stanford, CA: CSLI Publications.
    • Perry presents a sophisticated token-reflexive alternative to Kaplan’s semantic theory that contains many insights into the behavior of indexicals.
  • Predelli, Stefano. 2005. Contexts. Oxford: Clarendon Press.
    • Predelli investigates the philosophical foundations of the second kind of semantic theory attributed to Kaplan.
  • Reichenbach, Hans. 1947. Elements of Symbolic Logic. New York: Macmillan.
    • The book contains the original statement of the view that indexicals are token-reflexives.
  • Reimer, Marga. 1991. “Do Demonstrations Have Semantic Significance?” Analysis, 51(4): 177–183.
    • Reimer argues that the reference of a use of a demonstrative is fixed by the associated demonstration (or gesture).
  • Salmon, Nathan. 2002. “Demonstrating and Necessity.” The Philosophical Review, 111(4): 497–537.
    • Salmon argues for an alternative treatment of demonstratives within Kaplan’s semantic framework, according to which demonstrations are included in the context.
  • Salmon, Nathan. 2005. Reference and Essence, 2nd Edition. Amherst, NY: Prometheus Books.
    • Salmon investigates the relationship between the theory of direct reference in semantics and essentialism in metaphysics.
  • Schlenker, Philippe. 2003. “A Plea for Monsters.” Linguistics and Philosophy, 26: 29–120.
    • Schlenker presents data supporting the existence of monsters in natural language—specifically in propositional attitude reports—and offers a semantic theory that accommodates these data.
  • Soames, Scott. 2002. Beyond Rigidity. Oxford: Oxford University Press.
    • Soames attempts to consolidate the lessons of Kripke’s Naming and Necessity. Chapter 2 includes a sophisticated discussion of rigid designation and its significance for Kripke’s arguments against descriptivism about proper names.
  • Soames, Scott. 2005. Reference and Description. Princeton: Princeton University Press.
    • Soames presents an analysis and criticism of the approach to meaning called “two-dimensional semantics”, including a careful discussion of Kaplan’s logic and semantics of indexicals.
  • Yagisawa, Takashi. 1993. “Logic Purified.” Noûs, 27(4): 470–486.
    • Yagisawa applies the techniques of Kaplan’s logic to more natural uses of language.


Author Information

Geoff Georgi
Email: Geoff.Georgi@mail.wvu.edu
West Virginia University
U. S. A.
